[57]



ABSTRACT 



This invention relates to polynucleotides encoding Glyco- 
protein B from the RFHV/KSHV subfamily of gamma 
herpes viruses, three members of which are characterized in 
detail. DNA extracts were obtained from Macaque nemes- 
trina and Macaque mulatta monkeys affected with retro- 
peritoneal fibromatosis (RF), and human AIDS patients 
affected with Kaposi's sarcoma (KS). The extracts were 
amplified using consensus-degenerate oligonucleotide 
probes designed from known protein and DNA sequences of 
gamma herpes viruses. The nucleotide sequences of a 319 
base pair fragment are about 76% identical between RFHV1 
and KSHV, and about 60-63% identical with the closest
related gamma herpes viruses outside the RFHV/KSHV 
subfamily. Protein sequences encoded within these frag-
ments are about 91% identical between RFHV1 and
KSHV, and <~65% identical to that of other gamma herpes 
viruses. The full-length KSHV Glycoprotein B sequence 
comprises a transmembrane domain near the N-terminus, 
and a plurality of potentially antigenic sites in the extracel- 
lular domain. Materials and methods are provided to char- 
acterize Glycoprotein B encoding regions of members of the 
RFHV/KSHV subfamily, including but not limited to 
RFHV1, RFHV2, and KSHV. Peptides, polynucleotides, and
antibodies of this invention can be used for diagnosing 
infection, and for eliciting an immune response against 
Glycoprotein B. 

17 Claims, 34 Drawing Sheets 
is U.S. provisional patent application Ser. No. 60/004,297,
filed Sep. 26, 1995. This application is also a continuation- 
in-part of U.S. non-provisional application Ser. No. 08/720, 
229, filed Sep. 26, 1996, pending; for which the priority 
application is U.S. provisional patent application Ser. No. 
60/004,297, filed Sep. 26, 1995, now abandoned. This 
application claims priority benefit of all the above- 
referenced applications, which are hereby incorporated 
herein by reference in their entirety. 

FIELD OF THE INVENTION 

The present invention relates generally to the field of 
virology, particularly viruses of the herpes family. More 
specifically, it relates to the identification and characteriza- 
tion of herpes virus Glycoprotein B molecules which are 
associated with fibroproliferative and neoplastic conditions
in primates, including humans. 

BACKGROUND 

Kaposi's Sarcoma is a disfiguring and potentially fatal
form of hemorrhagic sarcoma. It is characterized by multiple 
vascular tumors that appear on the skin as darkly colored 
plaques or nodules. At the histological level, it is character- 
ized by proliferation of relatively uniform spindle-shaped 
cells, forming fascicles and vascular slits. There is often
evidence of plasma cells, T cells and monocytes in the 
inflammatory infiltrate. Death may ultimately ensue due to 
bleeding from gastrointestinal lesions or from an associated 
lymphoma. (See generally Martin et al., Finesmith et al.) 

Once a relatively obscure disease, it has leapt to public
attention due to its association with AIDS. As many as 20% 
of certain AIDS-affected populations acquire Kaposi's dur-
ing the course of the disease. Kaposi's Sarcoma occurs in
other conditions associated with immunodeficiency, includ-
ing kidney dialysis and therapeutic immunosuppression.
However, the epidemiology of the disease has suggested that 
immunodeficiency is not the only causative factor. In 
particular, the high degree of association of Kaposi's with 
certain sexual practices suggests the involvement of an 
etiologic agent which is not the human immunodeficiency
virus (Berel et al.). 

A herpes-virus-like DNA sequence has been identified in 
tissue samples from Kaposi's lesions obtained from AIDS 
patients (Chang et al., confirmed by Ambroziuk et al.). The 
sequence was obtained by representational difference analy-
sis (Lisitsyn et al.), in which DNA from affected and
unaffected tissue were amplified using unrelated priming 
oligonucleotides, and then hybridized together to highlight 
differences between the cells. The sequence was partly 
identical to known sequences of the Epstein Barr Virus and
herpesvirus saimiri. It coded for capsid and tegument 
proteins, two structural components sequestered in the viral 
interior. In a survey of tissues from various sources, the 
sequence was found in 95% of Kaposi's sarcoma lesions, 
regardless of the patients' HIV status (Moore et al. 1995a).
21% of uninvolved tissue from the same patients was 
positive, while 5% of samples from a control population was
positive. There was approximately 0.5% sequence variation
between samples. 

The same sequence has been detected in body cavity 
lymphoma, a lymphomatous effusion with B-cell features, 
occurring uniquely in AIDS patients (Cesarman et al.). The 
copy number was higher in body cavity lymphoma, com-
pared with Kaposi's Sarcoma. Other AIDS-associated lym-
phomas were negative. The sequence has also been found in
peripheral blood mononuclear cells of patients with Castle-
man's disease (Dupin et al.). This is a condition character-
ized by morphologic features of angiofollicular hyperplasia,
and associated with fever, adenopathy, and splenomegaly. 
The putative virus from which the sequence is derived has 
become known as Kaposi's Sarcoma associated Herpes 
Virus (KSHV). 

Using PCR in situ hybridization, Boshoff et al. have 
detected KSHV polynucleotide sequences in the cell types 
thought to represent neoplastic cells in Kaposi's sarcoma. 
Serological evidence supports an important role for KSHV 
in the etiology of Kaposi's sarcoma (O'Leary). Kedes et al. 
developed an immunofluorescence serological assay that 
detects antibody to a latency-associated nuclear antigen in B
cells latently infected with KSHV, and found that KSHV 
seropositivity is high in patients with Kaposi's sarcoma. Gao 
et al. found that of 40 patients with Kaposi's sarcoma, 32 
were positive for antibodies against KSHV antigens by an 
immunoblot assay, as compared with only 7 of 40 homo- 
sexual men without Kaposi's sarcoma immediately before 
the onset of AIDS. Miller et al. prepared KSHV antigens 
from a body cavity lymphoma cell line containing the 
genomes of both KSHV and Epstein-Barr virus. Antibodies 
to one antigen, designated p40, were identified in 32 of 48 
HIV-1 infected patients with Kaposi's sarcoma, as compared 
with only 7 of 54 HIV-1 infected patients without Kaposi's 
sarcoma. 

Zhong et al. analyzed the expression of KSHV sequences 
in affected tissue at the messenger RNA level. Two small 
transcripts were found that represent the bulk of the virus 
specific RNA transcribed from the KSHV genome. One 
transcript was predicted to encode a small membrane pro- 
tein; the other is an unusual poly-A RNA that accumulates 
in the nucleus and may have no protein encoding sequence. 
Messenger RNA was analyzed by cloning a plurality of 
overlapping KSHV genomic fragments that spanned the 
~120 kb KSHV genome from a lambda library of genomic
DNA. The clones were used as probes for Northern analysis, 
but their sequences were not obtained or disclosed. 

Moore et al. have partially characterized a KSHV genome 
fragment obtained from a body-cavity lymphoma. A 20.7 kb 
region of the genome was reportedly sequenced, although 
the sequence was not disclosed. 17 partial or complete open 
reading frames were present in this fragment, all except one 
having sequence and positional homology to other known 
gamma herpes virus genes, including the capsid maturation 
gene and the thymidine kinase gene. Phylogenetic analysis 
showed that KSHV was more closely related to equine 
herpes virus 2 and Saimiri virus than to Epstein Barr virus. 
The 20.7 kb region did not contain sequences encoding 
either Glycoprotein B or DNA polymerase. 

The herpes virus family as a whole comprises a number 
of multi-enveloped viruses about 100 nm in size, and 
capable of infecting vertebrates. (For general reviews, see, 
e.g., Emery et al., Fields et al.). The double-stranded DNA
genome is unusually large — from about 88 to about 229 
kilobases in length. It may produce over 50 different tran- 
scripts at various stages in the life cycle of the virus. A 
number of glycoproteins are expressed at the viral surface,
and play a role in recognition of a target cell by the virus, and 
penetration of the virus into the cell. These surface proteins 
are relatively more variant between species, compared with 
internal viral components (Karlin et al.). The same surface
proteins are also present on defective viral particles pro-
duced by cells harboring the virus. One such non-infectious
form is the L-particle, which comprises a tegument and a 
viral envelope, but lacks the nucleocapsid. 

The herpes virus family has been divided into several
subfamilies. Assignments to each of the categories were 
originally based on biologic properties, and are being refined 
as genomic sequence data emerges. The alpha subfamily 
comprises viruses that have a broad host range, a short 
replicative cycle, and an affinity for the sensory ganglia.
They include the human simplex virus and the Varicella- 
zoster virus. The beta subfamily comprises viruses that have 
a restricted host range, and include Cytomegalovirus and 
human Herpes Virus 6. The gamma subfamily comprises 
viruses that are generally lymphotropic. The DNA is
marked by a segment of about 110 kilobases with a low GC 
content, flanked by multiple tandem repeats of high GC 
content. The gamma subfamily includes Epstein Barr Virus 
(EBV), herpes virus saimiri, equine Herpes Virus 2 and 5, 
and bovine Herpes Virus 4.

Herpes viruses are associated with conditions that have a 
complex clinical course. A feature of many herpes viruses is 
the ability to go into a latent state within the host for an 
extended period of time. Viruses of the alpha subfamily 
maintain latent forms in the sensory and autonomic ganglia,
whereas those of the gamma subfamily maintain latent 
forms, for example, in cells of the lymphocyte lineage. 
Latency is associated with the transcription of certain viral 
genes, and may persist for decades until conditions are 
optimal for the virus to resume active replication. Such
conditions may include an immunodeficiency. In addition, 
some herpes viruses of the gamma subfamily have the 
ability to genetically transform the cells they infect. For 
example, EBV is associated with B cell lymphomas, oral 
hairy leukoplakia, lymphoid interstitial pneumonitis, and
nasopharyngeal carcinoma. 

A number of other conditions occur in humans and other 
vertebrates that involve fibroproliferation and the generation 
of pre-neoplastic cells. Examples occurring in humans are
retroperitoneal fibrosis, nodular fibromatosis, pseudosarco-
matous fibromatosis, and sclerosing mesenteritis. Another
condition known as Enzootic Retroperitoneal Fibromatosis 
(RF) has been observed in a colony of macaque monkeys at 
the University of Washington Regional Primate Research 
Center (Giddens et al.). Late stages of the disease are
characterized by proliferating fibrous tissue around the 
mesentery and the dorsal part of the peritoneal cavity, with 
extension into the inguinal canal, through the diaphragm, 
and into the abdominal wall. Once clinically apparent, the 
disease is invariably fatal within 1-2 months. The condition
has been associated with simian immunodeficiency (SAIDS) 
due to a type D simian retrovirus, SRV-2 (Tsai et al.). 
However, other colonies do not show the same frequency of 
RF amongst monkeys affected with SAIDS, and the fre-
quency of RF at Washington has been declining in recent
years. 

The study of such conditions in non-human primates is 
important not only as a model for human conditions, but also 
because one primate species may act as a reservoir of viruses 
that affect another species. For example, the herpes virus
saimiri appears to cause no disease in its natural host, the 
squirrel monkey (Saimiri sciureus), but it causes polyclonal
T-cell lymphomas and acute leukemias in other primates,
particularly owl monkeys. 

There is a need to develop reagents and methods for use 
in the detection and treatment of herpes virus infections. The 
etiological linkage between KSHV and Kaposi's sarcoma, 
confirmed by the serological evidence, indicates the impor- 
tance of this need. 

For example, there is a need to develop reagents and 
methods which can be used in the diagnosis and assessment 
of Kaposi's sarcoma, and similar conditions. Being able to 
detect the etiologic agent in a new patient may assist in 
differential diagnosis; being able to assess the level of the 
agent in an ongoing condition may assist in clinical man- 
agement. Desirable markers include those that provide a 
very sensitive indication of the presence of both active and 
latent forms of viral infection, analogous to the HBsAg of
Hepatitis B. Desirable markers also include those that are 
immunogenic, and can be used to assess immunological 
exposure to the viral agent as manifest in the antibody 
response. Glycoprotein antigens from the viral envelope are 
particularly suitable as markers with these characteristics. 
They may be expressed at high abundance near the surface 
not only of replicative forms of the virus, but also on 
L-particles produced by virally infected cells. 

Second, there is a need to develop reagents and methods 
that can be used for treatment of viral infection — both 
prophylactically, and following a viral challenge. Such 
reagents include vaccines that confer a level of immunity 
against the virus. Passive vaccines, such as those comprising 
an anti-virus antibody, may be used to provide immediate
protection or prevent cell penetration and replication of the 
virus in a recently exposed individual. Active vaccines, such 
as those comprising an immunogenic viral component, may 
be used to elicit an active and ongoing immune response in 
an individual. Antibody elicited by an active vaccine may 
help protect an individual against a subsequent challenge by 
live virus. Cytotoxic T cells elicited by an active vaccine 
may help eradicate a concurrent infection by eliminating 
host cells involved in viral replication. Suitable targets for a 
protective immune response, particularly antibody, are pro- 
tein antigens exposed on the surface of viral particles, and 
those implicated in fusion of the virus with target cells. 

Third, there is a need to develop reagents and methods 
which can be used in the development of new pharmaceu- 
ticals for Kaposi's sarcoma, and similar conditions. The 
current treatment for Kaposi's is radiation in combination 
with traditional chemotherapy, such as vincristine 
(Northfelt, Mitsuyasu). While lesions respond to these 
modalities, the response is temporary, and the downward 
clinical course generally resumes. Even experimental 
therapies, such as treatment with cytokines, are directed at 
the symptoms of the disease rather than the cause. Drug 
screening and rational drug design based upon the etiologic 
agent can be directed towards the long-felt need for a clinical 
regimen with long-term efficacy. Suitable targets for such 
pharmaceuticals are viral components involved in recogni- 
tion and penetration of host cells. These include glycopro- 
tein components of the viral envelope. 

Fourth, there is a need to develop reagents and methods 
which can be used to identify new viral agents that may be 
associated with other fibroproliferative conditions. The rep- 
resentational difference analysis technique used by Chang et 
al. is arduously complex, and probably not appropriate as a 
general screening test. More desirable are a set of oligo- 
nucleotide probes, peptides, and antibodies to be used as 
reagents in more routine assays for surveying a variety of 
tissue samples suspected of containing a related etiologic
agent. The reagents should be sufficiently specific to avoid 
identifying unrelated viruses and endogenous components 
of the host, and may be sufficiently cross-reactive to identify
related but previously undescribed viral pathogens.

SUMMARY OF THE INVENTION 

It is an objective of this invention to provide isolated 
polynucleotides, polypeptides, and antibodies derived from 
or reactive with the products of novel genes encoding 
Glycoprotein B molecules of the RFHV/KSHV subfamily of 
herpes viruses. Two members of the family are Retroperi- 
toneal Fibromatosis associated Herpes Virus (RFHV) and 
Kaposi's Sarcoma associated Herpes Virus (KSHV). These 
materials and related methods can be used in the diagnosis 
and treatment of herpes virus infection in primates, includ- 
ing humans. Isolated or recombinant Glycoprotein B frag- 
ments or polynucleotides encoding them may be used as 
components of an active herpes vaccine, while antibodies 
specific for Glycoprotein B may be used as components of
a passive vaccine. 

Accordingly, one of the embodiments of the invention is 
an isolated polynucleotide with a region encoding a Glyco-
protein B of a herpes virus of the RFHV/KSHV subfamily,
the polynucleotide comprising a sequence of 319 nucle- 
otides at least 65% identical to nucleotides 36 to 354 of SEQ. 
ID NO:1 or SEQ. ID NO:3, which are 319 nucleotide
fragments encoding Glycoprotein B from RFHV and KSHV, 
respectively. Also embodied is an isolated polynucleotide
with a region encoding a Glycoprotein B, the polynucleotide 
comprising a sequence selected from the group consisting 
of: a sequence of 35 nucleotides at least 74% identical to 
oligonucleotide SHMDA (SEQ. ID NO:41); a sequence of 
30 nucleotides at least 73% identical to oligonucleotide
CFSSB (SEQ. ID NO:43); a sequence of 29 nucleotides at 
least 72% identical to oligonucleotide ENTFA (SEQ. ID 
NO:45); and a sequence of 35 nucleotides at least 80%
identical to oligonucleotide DNIQB (SEQ. ID NO:46). 
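
By way of illustration only, the percent-identity thresholds recited above can be restated as minimum numbers of matching positions. The following Python sketch assumes that identity is scored position by position over an ungapped comparison of two sequences of equal length; that scoring rule, and the function names percent_identity and min_matches, are assumptions made for this example and are not taken from the specification.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def min_matches(length: int, percent: int) -> int:
    """Smallest number of matching positions giving at least `percent` identity."""
    # Integer ceiling of length * percent / 100, avoiding floating point.
    return -(-length * percent // 100)


# Thresholds recited in the text: (length in nucleotides, minimum percent identity).
thresholds = {
    "319-nucleotide fragment": (319, 65),
    "SHMDA": (35, 74),
    "CFSSB": (30, 73),
    "ENTFA": (29, 72),
    "DNIQB": (35, 80),
}
for name, (length, percent) in thresholds.items():
    print(name, min_matches(length, percent), "of", length)
```

Under the stated assumption, a 35-nucleotide sequence that is at least 74% identical to SHMDA agrees with it at 26 or more positions, and a 319-nucleotide sequence that is at least 65% identical to the reference fragment agrees with it at 208 or more positions.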

Another embodiment of the invention is an isolated
polynucleotide comprising a fragment of at least 21, pref- 
erably 35, more preferably 50, still more preferably 75, and 
even more preferably 100 consecutive nucleotides of the 
Glycoprotein B encoding region of the polynucleotide of the 
preceding embodiments. The polynucleotide is preferably
from a virus capable of infecting primates. Included are 
Glycoprotein B encoding polynucleotide fragments from 
RFHV and KSHV. Another embodiment of the invention is 
an isolated polynucleotide comprising a linear sequence of 
at least about 21 nucleotides identical to the Glycoprotein
B encoding sequence between nucleotides 36 to 354 inclu-
sive of SEQ. ID NO:1, SEQ. ID NO:3, or SEQ. ID NO:92,
or anywhere within SEQ. ID NO:96, but not in SEQ. ID 
NO:98. 

A further embodiment of this invention is an isolated
polypeptide encoded by any of the previous embodiments. 
Also embodied is an isolated polypeptide, comprising a 
linear sequence of at least 17 amino acids essentially iden- 
tical to the Glycoprotein B protein sequence shown in SEQ. 
ID NO:2, SEQ. ID NO:4, or SEQ. ID NO:97, or anywhere
within SEQ. ID NO:94 (KSHV), but not in SEQ. ID NO:99. 
This includes fusion polypeptides, immunogenic 
polypeptides, and polypeptides occurring in glycosylated 
and unglycosylated form. Some preferred antigen peptides 
are listed in SEQ. ID NOS:67-76. Also embodied are
isolated and non-naturally occurring polynucleotides encod- 
ing any of the aforementioned polypeptides, along with 
cloning vectors, expression vectors and transfected host
cells derived therefrom. Further embodiments are methods
for producing polynucleotides or polypeptides of this 
invention, comprising replicating vectors of the invention or 
expressing polynucleotides in suitable host cells. 

Yet another embodiment of this invention is a monoclonal 
or isolated polyclonal antibody specific for a Glycoprotein B 
polypeptide embodied in this invention, or a Glycoprotein B 
encoded in the encoding region of a polynucleotide embod- 
ied in this invention. The antibodies are specific for mem- 
bers of the RFHV/KSHV subfamily, and do not cross-react 
with more distantly related Glycoprotein B sequences, par- 
ticularly SEQ. ID NOS:30-41. 

Still another embodiment of this invention is a vaccine 
comprising a polypeptide of this invention in a pharmaceu- 
tically compatible excipient, and optionally also comprising 
an adjuvant. In certain embodiments, the polypeptide of the 
vaccine comprises an RGD sequence. Another embodiment 
of this invention is a vaccine comprising a polynucleotide of 
this invention, which may be in the form of a live virus or 
viral expression vector. Another embodiment of this inven- 
tion is a vaccine comprising an antibody of this invention in 
a pharmaceutically compatible excipient. Other embodi- 
ments are methods for treating a herpes virus infection, 
either prophylactically or during an ongoing infection, com- 
prising administering one of the aforementioned embodi- 
ments. 

Also embodied in this invention are methods of inhibiting 
attachment of a herpes virus to a cell, or preventing infection 
or pathology due to a member of the RFHV/KSHV virus 
subfamily, comprising contacting the cell or introducing into 
the environment a polypeptide according to this invention 
comprising an RGD sequence. 

Further embodiments of this invention are oligonucle- 
otides specific for Glycoprotein B encoding sequences of the 
gamma herpes subfamily, the RFHV/KSHV subfamily, 
RFHV, and KSHV, especially those listed in SEQ. ID 
NOS:24-63. Also embodied are methods for obtaining an
amplified copy of a polynucleotide encoding a Glycoprotein 
B, comprising contacting the polynucleotide with one or 
more of the aforementioned oligonucleotides. The poly- 
nucleotide to be amplified may be taken from an individual 
affected with a disease featuring fibroblast proliferation and 
collagen deposition, including but not limited to Retroperi- 
toneal Fibromatosis or Kaposi's Sarcoma, or a malignancy 
of the lymphocyte lineage. 

Additional embodiments of this invention are methods for 
detecting viral DNA or RNA in a sample. One method 
comprises the steps of contacting the DNA or RNA in the 
sample with a probe comprising a polynucleotide or oligo- 
nucleotide of this invention under conditions that would 
permit the probe to form a stable duplex with a polynucle-
otide having the sequence shown in SEQ. ID NO:1 or SEQ.
ID NO:3, or both, but not with a polynucleotide having a
sequence of herpes viruses outside the RFHV/KSHV 
subfamily, particularly SEQ. ID NOS:5-13, and detecting
the presence of any duplex formed thereby. The conditions 
referred to are a single set of reaction parameters, such as 
incubation time, temperature, solute concentrations, and 
washing steps, that would permit the polynucleotide to form 
a stable duplex if alternatively contacted with a polynucle-
otide with SEQ. ID NO:1, or with a polynucleotide with
SEQ. ID NO:3, or with both, but not with a polynucleotide
of any of SEQ. ID NOS:5-13. Another method comprises the
steps of amplifying the DNA or RNA in the sample using an 
oligonucleotide of this invention as a primer in the ampli-
fication reaction, and detecting the presence of any amplified
copies. Also embodied are isolated polynucleotides identi- 
fied by the aforementioned methods, as may be present in 
the genome of a naturally occurring virus or affected tissue. 

Further embodiments of this invention are diagnostic kits
for detecting components related to herpes virus infection in 
a biological sample, such as may be obtained from an 
individual suspected of harboring such an infection, com- 
prising a polynucleotide, oligonucleotide, polypeptide, or 
antibody of this invention in suitable packaging. Also
embodied are methods of detecting infection of an 
individual, comprising applying the reagents, methods, or 
kits of this invention on biological samples obtained from 
the individual. 

Still other embodiments of this invention are therapeutic
compounds and compositions for use in treatment of an 
individual for infection by a gamma herpes virus. Included 
are therapeutic agents that comprise polynucleotides and 
vectors of this invention for the purpose of gene therapy. 
Also included are pharmaceutical compounds identified by
contacting a polypeptide embodied in this invention with the 
compound and determining whether a biochemical function 
of the polypeptide is altered. Also included are pharmaceu- 
tical compounds obtained from rational drug design, based 
on structural and biochemical features of a Glycoprotein B
molecule. 

BRIEF DESCRIPTION OF THE FIGURES 

FIG. 1 is a listing of polynucleotide sequences amplified 
from a Glycoprotein B encoding region of RFHV and
KSHV. The 319-base polynucleotide segment between resi- 
dues 36 to 354 is underlined, and represents the respective 
viral gene segment between the primers used to amplify it. 
Aligned with the polynucleotide sequences are oligonucle-
otides that may be used as hybridization probes or PCR
primers. Type 1 oligonucleotides comprise a gamma herpes 
consensus sequence, and can be used to amplify a Glyco- 
protein B gene segment of a gamma herpes virus. Examples 
shown are NIVPA and TVNCB. Type 2 oligonucleotides 
comprise a consensus sequence from the RFHV/KSHV
subfamily, and can be used to amplify a Glycoprotein B gene
segment of a virus belonging to the subfamily. Examples 
shown are SHMDA, CFSSB, ENTFA and DNIQB. The 
other oligonucleotides shown are Type 3 oligonucleotides. 
These comprise sequences taken directly from the RFHV or
KSHV sequence, and are specific for sequences from the 
respective virus. Oligonucleotides that initiate amplification 
in the direction of the coding sequence (with designations 
ending in "A") are listed 5'→3'. Oligonucleotides that ini-
tiate amplification in the direction opposite to that of the
coding sequence (with designations ending in "B") are listed
3'→5'. Also shown are the polypeptides encoded by the
RFHV and KSHV polynucleotide sequences. The aspar-
agine encoded by nucleotides 238-240 in both sequences is
a potential N-linked glycosylation site conserved with other
herpes viruses. 
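
As a non-limiting sketch of the coordinate convention used for FIG. 1, the following Python fragment extracts a segment given 1-based, inclusive residue numbers, and scans a polypeptide for the N-X-S/T pattern (X being any residue other than proline) that is commonly used to flag potential N-linked glycosylation sites. The function names and the toy sequences are hypothetical and chosen for this example; they are not the sequences of FIG. 1.

```python
import re
from typing import List


def segment(seq: str, first: int, last: int) -> str:
    """Return residues first..last of seq, numbered from 1, inclusive."""
    if not 1 <= first <= last <= len(seq):
        raise ValueError("coordinates out of range")
    return seq[first - 1:last]


# Asparagine followed by any residue except proline, then serine or threonine.
# The lookahead lets overlapping candidate sites be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glycosylation_sites(protein: str) -> List[int]:
    """1-based positions of asparagines in a potential N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Toy data only: a 400-residue placeholder and a made-up peptide.
placeholder = "ACGT" * 100
print(len(segment(placeholder, 36, 354)))   # residues 36 to 354 inclusive
print(n_glycosylation_sites("MNASQNPTANGT"))
```

With inclusive numbering, residues 36 to 354 span 354 - 36 + 1 = 319 positions, which agrees with the 319-base segment described for FIG. 1.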

FIG. 2 is a map of the Glycoprotein B encoding DNA 
sequence believed to be contained in the KSHV genome, 
and other members of the RFHV/KSHV subfamily. Shown
is the approximate location of the KSHV Glycoprotein B
sequence described herein. Also shown are the putative 
conserved segments that represent hybridization sites for 
Type 1 consensus/degenerate oligonucleotides useful in 
probing and amplifying Glycoprotein B sequences from 
gamma herpes viruses.
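
For illustration, a degenerate oligonucleotide can be regarded as shorthand for a pool of concrete sequences. The Python sketch below expands the standard IUPAC nucleotide ambiguity codes into that pool. The function names and the example strings are made up for this example, and the strings are not oligonucleotides of this invention.

```python
from itertools import product
from typing import List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences represented by a degenerate oligonucleotide."""
    count = 1
    for base in oligo.upper():
        count *= len(IUPAC[base])
    return count


def expand(oligo: str) -> List[str]:
    """All concrete sequences represented by a degenerate oligonucleotide."""
    choices = (IUPAC[base] for base in oligo.upper())
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical example: two twofold positions and one fourfold position.
print(degeneracy("ACRYN"))
print(expand("AY"))
```

Because the pool size is the product of the per-position choices, each added ambiguity code multiplies the number of sequences that the oligonucleotide represents.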

FIGS. 3A-3D are listings of some previously known
herpes virus Glycoprotein B protein sequences, aligned with 
the complete KSHV Glycoprotein B protein sequence and
fragments of RFHV1 and RFHV2. Boxed regions indicate 
the putative pre-processing signal sequence and the trans- 
membrane domain. Cysteine residues are underlined. Resi- 
dues that are highly conserved amongst herpes virus Gly- 
coprotein B sequences are underscored with an asterisk (*). 
Cysteines appearing uniquely in the KSHV Glycoprotein B 
are underscored with a bullet (•). 

FIG. 4 is a listing of previously known Glycoprotein B 
polynucleotide sequences of gamma herpes viruses, show- 
ing a conserved region, and the Type 1 oligonucleotide 
FRFDA designed therefrom. 

FIG. 5 is a listing of previously known Glycoprotein B 
polynucleotide sequences of gamma herpes viruses, show- 
ing a conserved region, and the Type 1 oligonucleotides 
NIVPA and NIVPASQ designed therefrom. 

FIG. 6 is a listing of previously known Glycoprotein B 
polynucleotide sequences of gamma herpes viruses, show- 
ing a conserved region, and the Type 1 oligonucleotides 
TVNCA, TVNCB and TVNCBSQ designed therefrom.

FIG. 7 is a listing of previously known Glycoprotein B 
polynucleotide sequences of gamma herpes viruses, show- 
ing a conserved region, and the Type 1 oligonucleotide 
FAYDA designed therefrom. 

FIG. 8 is a listing of previously known Glycoprotein B 
polynucleotide sequences of gamma herpes viruses, show- 
ing a conserved region, and the Type 1 oligonucleotides 
IYGKA and IYGKASQ designed therefrom. 

FIG. 9 is a listing of previously known Glycoprotein B 
polynucleotide sequences of gamma herpes viruses, show- 
ing a conserved region, and the Type 1 oligonucleotides 
CYSRA and CYSRASQ designed therefrom. 

FIG. 10 is a listing of previously known Glycoprotein B 
polynucleotide sequences of gamma herpes viruses, show- 
ing a conserved region, and the Type 1 oligonucleotides 
NIDFB and NMDFBSQ designed therefrom. 

FIG. 11 is a listing of previously known Glycoprotein B 
polynucleotide sequences of gamma herpes viruses, show- 
ing a conserved region, and the Type 1 oligonucleotides 
FREYA, FREYB and NVFDA designed therefrom. 

FIG. 12 is a listing of previously known Glycoprotein B 
polynucleotide sequences of gamma herpes viruses, show- 
ing a conserved region, and the Type 1 oligonucleotide 
GGMA designed therefrom. 

FIGS. 13A and 13B are listings of a portion of the
Glycoprotein B polynucleotide sequence from RFHV and 
KSHV, aligned with previously known gamma herpes Gly- 
coprotein B polynucleotide sequences. Each shared residue 
is indicated as a period. 
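
The period convention used in such alignment listings can be sketched as follows. This Python fragment assumes that each row is compared, position by position, against a single reference row of the same length; the choice of reference row, the function name dot_mask, and the toy sequences are assumptions made for this example.

```python
def dot_mask(reference: str, other: str) -> str:
    """Show `other` with a period wherever it shares the reference residue."""
    if len(reference) != len(other):
        raise ValueError("aligned rows must have the same length")
    return "".join("." if r == o else o for r, o in zip(reference, other))


# Toy aligned rows (hypothetical, not taken from FIGS. 13A and 13B).
reference_row = "ACGTAC"
aligned_row = "ACCTAG"
print(reference_row)
print(dot_mask(reference_row, aligned_row))
```

Rendering shared residues as periods leaves only the differing positions visible, which is what makes divergence between closely related sequences easy to read in a long listing.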

FIG. 14 is a comparison listing of the polypeptide 
sequences of Glycoprotein B from various gamma herpes 
viruses, encoded between the hybridization sites of NIVPA 
and TVNCB in the polynucleotide sequences. The Class II 
sequence fragments shown underlined are predicted to be 
RFHV/KSHV cross-reactive antigen peptides. The Class III
sequences shown in lower case are predicted to be RFHV or 
KSHV virus-specific peptides. 

FIG. 15 is an alignment of the polypeptide sequences of 
Glycoprotein B over a broader spectrum of herpes viruses in 
the gamma, beta, and alpha subfamilies. 

FIG. 16 is a relationship map of Glycoprotein B, based on 
the polypeptide sequences shown in FIG. 15. 

FIGS. 17A and 17B are listings of exemplary Type 2 
(subfamily-specific) oligonucleotides, aligned with the 
nucleotide sequences from which they were derived. 
FIG. 18 is an approximate map of Glycoprotein B and
DNA polymerase encoding regions as they appear in the 
KSHV genome, showing the hybridization position of oli- 
gonucleotide primers. 

FIGS. 19A-19D are listings of a KSHV DNA sequence obtained by
amplifying fragments upstream and downstream from the 
sequence in FIG. 1. An open reading frame is shown for the 
complete KSHV Glycoprotein B sequence, flanked by open 
reading frames for the capsid maturation gene and DNA 
polymerase. Underlined in the nucleotide sequence is a
putative Glycoprotein B promoter.

FIG. 20 is a Hopp-Woods antigenicity plot for the 106
amino acid Glycoprotein B polypeptide fragment of RFHV
encoded between NIVPA and TVNCB. Indicated below are 
spans of hydrophobic and antigenic residues in the
sequence. 

FIG. 21 is a Hopp-Woods antigenicity plot for the 106
amino acid Glycoprotein B polypeptide fragment of KSHV
encoded between NIVPA and TVNCB. 

FIG. 22 is a Hopp-Woods antigenicity plot for the com- 
plete Glycoprotein B from KSHV. 
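
A Hopp-Woods plot of the kind referred to for FIGS. 20-22 can be approximated as a moving average of per-residue hydrophilicity values. The Python sketch below uses the published Hopp-Woods scale with a six-residue window. The window length, the function name, and the toy peptide are assumptions for this example, since the parameters used to prepare the figures are not stated in this passage.

```python
from typing import List

# Hopp-Woods hydrophilicity values (Hopp and Woods, 1981).
HOPP_WOODS = {
    "A": -0.5, "R": 3.0, "N": 0.2, "D": 3.0, "C": -1.0,
    "Q": 0.2, "E": 3.0, "G": 0.0, "H": -0.5, "I": -1.8,
    "L": -1.8, "K": 3.0, "M": -1.3, "F": -2.5, "P": 0.0,
    "S": 0.3, "T": -0.4, "W": -3.4, "Y": -2.3, "V": -1.5,
}


def hopp_woods_profile(protein: str, window: int = 6) -> List[float]:
    """Mean hydrophilicity of each consecutive window along the sequence."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    values = [HOPP_WOODS[aa] for aa in protein]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Toy peptide (hypothetical): a charged stretch followed by a hydrophobic one.
profile = hopp_woods_profile("KDERSNLLVIFW")
peak = max(range(len(profile)), key=profile.__getitem__)
print("most hydrophilic window starts at residue", peak + 1)
```

Windows with high average values are candidate surface-exposed, potentially antigenic stretches, while strongly negative windows correspond to hydrophobic spans such as those expected in a transmembrane domain.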

FIG. 23 is a listing of DNA and protein sequences for a 
Glycoprotein B fragment of a third member of the RFHV/KSHV 
subfamily, designated RFHV2. The 319-base polynucleotide 
segment between residues 36 and 354 is underlined, and 
represents the Glycoprotein B encoding segment between the 
primers used to amplify it. 

FIG. 24 is an alignment of protein sequences showing an 
RGD triplet near the N-terminus of mature KSHV Glycoprotein B. 
The upper panel shows alignment of the Glycoprotein B 
with RGD domains in other proteins. The lower 
panel shows predicted signal peptidase cleavage sites for 
producing the mature form of Glycoprotein B. 

We have discovered and characterized polynucleotides 
encoding Glycoprotein B from herpes viruses of the RFHV/KSHV 
subfamily. The polynucleotides, oligonucleotides, 
polypeptides and antibodies embodied in this invention are 
useful in the diagnosis, clinical monitoring, and treatment of 
herpes virus infections and related conditions. 

The source for the polynucleotide for the RFHV Glycoprotein B 
was affected tissue samples taken from Macaca 
nemestrina monkeys with retroperitoneal fibromatosis 
("RF"). The polynucleotide for the KSHV Glycoprotein B 
was obtained from affected tissue samples taken from 
humans with Kaposi's Sarcoma ("KS"). The tissues used for 
the present invention were known to contain genetic material 
from RFHV or KSHV, because they had previously been 
used successfully to clone corresponding DNA Polymerase 
encoding fragments. The amplification of the DNA Polymerase 
regions has been described in commonly owned 
U.S. patent application Ser. No. 60/001,148. 

In order to amplify the Glycoprotein B sequences from 
these samples, we designed oligonucleotides from those of 
other herpes viruses. Glycoprotein B is expected to be less 
well conserved between herpes viruses, because it is 
externally exposed on the viral envelope and therefore under 
selective pressure from the immune system of the hosts the 
viruses infect. Accordingly, the oligonucleotides were designed 
from sequences of herpes viruses believed to be most closely 
related to RFHV and KSHV. These two viruses are known 
from the DNA polymerase sequences to be closely related 
gamma type herpes viruses. 

Oligonucleotides were designed primarily from Glycoprotein B 
sequences previously known for four gamma 
herpes viruses: sHV1, eHV2, bHV4, mHV68 and hEBV. 
Comparison of the amino acid sequences of these four 
Glycoprotein B molecules revealed nine relatively conserved 
regions. Based on the sequence data, oligonucleotides 
were constructed comprising a degenerate segment 
and a consensus segment, as described in a following 
section. Three of these oligonucleotides have been used as 
primers in amplification reactions that have yielded fragments 
of the RFHV and KSHV Glycoprotein B encoding 
segments from the RF and KS tissue. 

The RFHV and KSHV polynucleotide sequence fragments 
obtained after the final amplification step are shown 
in FIG. 1 (SEQ ID NO:1 and SEQ ID NO:3, respectively). 
Included are segments at each end corresponding to the 
hybridizing regions of the NIVPA and TVNCB primers used 
in the amplification. The fragment between the primer 
binding segments is 319 base pairs in length (residues 
36-354), and is believed to be an accurate reflection of the 
sequences of the respective Glycoprotein B encoding 
regions of the RFHV and KSHV genomes. 

The 319 base pair Glycoprotein B encoding polynucleotide 
segment from RFHV is only 60% identical with that 
from sHV1 and bHV4, the most closely related sequences 
from outside the RFHV/KSHV subfamily. The 319 base pair 
polynucleotide segment from KSHV is only 63% identical 
with sHV1 and bHV4. The segments are 76% identical 
between RFHV and KSHV. 
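
By way of illustration only, percent identity figures of the kind quoted above can be computed from two sequences that have already been aligned to equal length. The following is a minimal Python sketch under that assumption; the function name and the toy sequences are hypothetical and are not taken from the sequence listings, and gap handling is deliberately omitted.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned, equal-length
    sequences carry the same residue (case-insensitive)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Toy example (hypothetical sequences): 3 of 4 positions match.
print(percent_identity("ACGT", "ACGA"))
```

The same function applies to nucleotide or amino acid sequences, since it only compares characters position by position.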

Also shown are the corresponding predicted amino acid 
sequences (SEQ ID NO:2 and SEQ ID NO:4). The polypeptide 
sequences are novel, and are partly homologous to 
Glycoprotein B sequences from other herpes viruses. The 
fragments shown are predicted to be about 1/8 of the entire 
Glycoprotein B sequence. They begin about 80 amino acids 
downstream from the predicted N-terminal methionine of 
the pre-processed protein. There is a potential N-linked 
glycosylation site at position 80 of the amino acid sequence, 
according to the sequence Asn-Xaa-(Thr/Ser). This site is 
conserved between RFHV and KSHV, and is also conserved 
amongst other known gamma herpes viruses. There is also 
a cysteine residue at position 58 that is conserved across 
herpes viruses of the gamma, beta, and alpha subfamilies, 
which may play a role in maintaining the three-dimensional 
structure of the protein. 
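
As an illustrative sketch only, candidate N-linked glycosylation sites matching the Asn-Xaa-(Thr/Ser) pattern can be located in a one-letter amino acid sequence with a regular expression. The function name and the toy peptide below are hypothetical. Many motif definitions additionally exclude proline at the Xaa position; the text above does not state that restriction, so the sketch does not apply it.

```python
import re

# Lookahead so that overlapping motifs (e.g. "NNSS") are all reported.
SEQUON = re.compile(r"(?=N[A-Z][ST])")


def find_n_glycosylation_sites(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that begin an
    Asn-Xaa-(Thr/Ser) motif in a one-letter amino acid sequence."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Toy peptide (hypothetical): motifs start at residues 3, 7 and 10.
print(find_n_glycosylation_sites("MANGTKNASNNS"))
```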

The 106 amino acid segment of Glycoprotein B encoded 
by the 319 base pairs between the amplification primers is 
91% identical between RFHV and KSHV, but only 65% 
identical between KSHV and that of bHV4, the closest 
sequence outside the RFHV/KSHV subfamily. 

Glycoprotein B molecules expressed by the RFHV/ 
KSHV herpes virus subfamily are expected to have many of 
the properties described for Glycoprotein B of other herpes 
viruses. Glycoprotein B molecules are generally about 110 
kDa in size, corresponding to about 800-900 amino acids or 
about 2400-2700 base pairs. Hydrophobicity plots indicate 
regions from the N terminus to the C terminus in the 
following order: a hydrophobic region corresponding to a 
membrane-directing leader sequence; a mixed polarity 
region corresponding to an extracellular domain; a hydro- 
phobic region corresponding to a transmembrane domain; 
and another mixed polarity region corresponding to a cyto- 
plasmic domain. 
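
The correspondence between amino acid counts and base pair counts quoted in this description follows from the triplet genetic code. The short Python sketch below is offered for illustration only; the function names are hypothetical, stop codons are ignored, and the segment is assumed to be read in a single frame starting at its first base.

```python
def coding_length_bp(amino_acids: int) -> int:
    """Base pairs needed to encode a polypeptide, ignoring any stop codon."""
    return 3 * amino_acids


def complete_codons(base_pairs: int) -> int:
    """Number of whole codons in a segment read in a single frame."""
    return base_pairs // 3


# 800-900 amino acids correspond to 2400-2700 base pairs.
print(coding_length_bp(800), coding_length_bp(900))
# A 319 base pair segment contains 106 complete codons.
print(complete_codons(319))
```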

The full sequence of the KSHV Glycoprotein B, shown in 
FIG. 19, confirms these predictions: the gene encodes about 
845 amino acids including the signal peptide and a 
transmembrane region near the C-terminus. Cysteine residues are 
conserved with other Glycoprotein B sequences, and an 
additional potential disulfide may help stabilize the 
three-dimensional structure. 

Glycoprotein B is generally expressed on the envelope of 
infectious and defective viral particles, and on the surface of 
infected cells. It is generally glycosylated, and may comprise 
5-20 glycosylation sites or more. It is also generally 
expressed as a protein dimer, which assembles during 
translocation to the surface of the host cell, prior to budding of 
the virus. The site responsible for dimerization appears to be 
located between about amino acid 475 and the 
membrane-spanning segment (Navarro et al.). 

Previous studies have mapped several biochemical functions 
related to infectivity to different regions of the 
Glycoprotein B molecule. Glycoprotein B and Glycoprotein C 
are both implicated in initial binding of HSV1 and bovine 
herpes virus 1 to target cells (Herold et al., Byrne et al.). The 
moiety on the cells recognized by Glycoprotein B appears to 
be heparan sulfate; the binding is inhibitable by fluid-phase 
heparin. Mutants that lack Glycoprotein C can still bind 
target cells, but mutants that lack both Glycoprotein C and 
Glycoprotein B are severely impaired in their ability to gain 
access to the cells. 

Another apparently important function is the ability of 
Glycoprotein B to promote membrane fusion and entry of 
the virus into the cell. In human CMV, the fusogenic role 
appears to map to the first hydrophobic domain of Glyco- 
protein B, and may be associated with conserved glycine 
residues within this region (Reschke et al.). In HSV1 
mutants, the ability of Glycoprotein B to promote syncytia 
formation maps to multiple sites in the cytoplasmic domain 
of the protein, near the C-terminus (Kostal et al.). 

In order to exercise some of these more complicated 
functions, it seems likely that Glycoprotein B associates not 
only with a second Glycoprotein B molecule, but with other 
components encoded by the virus. For example, the UL45 
gene product appears to be required for Glycoprotein B 
induced fusion (Haanes et al.). It has been hypothesized that 
Glycoprotein B cooperates with other surface proteins to 
form a hydrophobic fusion pore in the surface of the target 
cell (Pereira et al.). Glycoprotein B has been found to elicit 
a potent antibody response capable of neutralizing the intact 
virus. Monoclonal antibodies with neutralizing activity may 
be directed against many different sites on the Glycoprotein 
B molecule. 

Consequently, it is expected that the Glycoprotein B 
molecule bears sites that interact with the target cell, help 
promote fusion, and associate with other viral proteins. It is 
predicted that Glycoprotein B molecules of RFHV/KSHV 
subfamily viruses will perform many of the functions of 
Glycoprotein B in other species of herpes virus, and bear 
active regions with some of the same properties. Interfering 
with any of these active regions with a drug, an antibody, or 
by mutation, may impair viral infectivity or virulence. 

Subsequent to discovery of the Glycoprotein B of RFHV 
and KSHV, a third member of the RFHV/KSHV subfamily 
was identified in a sample of affected tissue from a Macaca 
mulatta (Example 12). This Glycoprotein B is closely related 
but not identical to that of RFHV, and the virus is designated 
RFHV2. It is predicted that other members of the RFHV/KSHV 
subfamily will emerge, including some that are pathogenic to 
humans. This disclosure teaches how new members of the 
subfamily can be detected and characterized. 

The homology between Glycoprotein B sequences within 
the RFHV/KSHV subfamily means that the polynucleotides 
and polypeptides embodied in this invention are reliable 
markers amongst different strains of the subfamily. The 
polynucleotides, polypeptides, and antibodies embodied in 
this invention are useful in such applications as the detection 
and treatment of viral infection in an individual, due to 
RFHV, KSHV, or other herpes viruses in the same subfamily. 
The polynucleotides, oligonucleotide probes, polypeptides, 
antibodies, and vaccine compositions relating to 
Glycoprotein B, and the preparation and use of these compounds, are 
described in further detail in the sections that follow. 

Abbreviations 

The following abbreviations are used herein to refer to 
species of herpes viruses, and polynucleotides and polypep- 
tides derived therefrom: 

TABLE 1 

Abbreviations for Herpes Virus Strains 

Designation  Virus                                                       Provisional Subfamily Assignment 
-----------  ----------------------------------------------------------  -------------------------------- 
RFHV         simian Retroperitoneal Fibromatosis-associated Herpesvirus  gamma-Herpes Virus 
KSHV         human Kaposi's Sarcoma-associated Herpesvirus 
mHV68        murine Herpesvirus 68 
bHV4         bovine Herpesvirus 4 
eHV2         equine Herpesvirus 2 
sHV1         saimiri monkey Herpesvirus 1 
hEBV         human Epstein-Barr Virus 
hCMV         human CytoMegaloVirus                                       beta-Herpes Virus 
mCMV         murine CytoMegaloVirus 
gpCMV        guinea pig CytoMegaloVirus 
hHV6         human Herpesvirus 6 
hVZV         human Varicella-Zoster Virus                                alpha-Herpes Virus 
HSV1         human Herpes Simplex Virus 1 
HSV2         human Herpes Simplex Virus 2 
sHVSA8       simian Herpesvirus A8 
eHV1         equine Herpesvirus 1 
iHV1         ictalurid catfish Herpesvirus 

General Definitions 

"Glycoprotein B" is a particular protein component of a 
herpes virus, encoded in the viral genome and believed to be 
expressed at the surface of the intact virus. Functional 
studies with certain species of herpes virus, especially 
HSV1, hCMV, and bovine herpes virus 1, have implicated 
Glycoprotein B in a number of biochemical functions related 
to viral infectivity. These include binding to components on 
the surface of target cells, such as heparan sulfate, fusion of 
the viral membrane with the membrane of the target cell, 
penetration of the viral capsid into the cell, and formation of 
polynucleated syncytial cells. Glycoprotein B has been 
observed as a homodimer, and may interact with other viral 
surface proteins in order to exert some of its biochemical 
functions. Different biochemical functions, particularly 
heparan sulfate binding and membrane fusion, appear to 
map to different parts of the Glycoprotein B molecule. A 
Glycoprotein B molecule of other herpes viruses, including 
members of the RFHV/KSHV subfamily, may perform any 
or all of these functions. As used herein, the term 
Glycoprotein B includes unglycosylated, partly glycosylated, and 
fully glycosylated forms, and both monomers and polymers. 

As used herein, a Glycoprotein B fragment, region, or 
segment is a fragment of the Glycoprotein B molecule, or a 
transcript of a subregion of a Glycoprotein B encoding 
polynucleotide. The intact Glycoprotein B molecule, or the 
full-length transcript, will exert biochemical functions 
related to viral activity, such as those described above. Some 
or all of these functions may be preserved on the fragment, 
or the fragment may be from a part of the intact molecule 
which is unable to perform these functions on its own. 

"Glycoprotein B activity" refers to any biochemical function 
of Glycoprotein B, or any biological activity of a herpes 
virus attributable to Glycoprotein B. These may include but 
are not limited to binding of the protein to cells, cell 
receptors such as heparan sulfate, and receptor analogs; viral 
binding or penetration into a cell, or cell fusion. 

The term "Glycoprotein B gene" refers to a gene comprising 
a sequence that encodes a Glycoprotein B molecule 
as defined above. It is understood that a Glycoprotein B gene 
may give rise to processed and altered translation products, 
including but not limited to forms of Glycoprotein B with or 
without a signal or leader sequence, truncated or internally 
deleted forms, multimeric forms, and forms with different 
degrees of glycosylation. 

As used herein, a "DNA Polymerase" is a protein or a 
protein analog, that under appropriate conditions is capable 
of catalyzing the assembly of a DNA polynucleotide with a 
sequence that is complementary to a polynucleotide used as 
a template. A DNA Polymerase may also have other catalytic 
activities, such as 3'-5' exonuclease activity; any of the 
activities may predominate. A DNA Polymerase may require 
association with additional proteins or co-factors in order to 
exercise its catalytic function. 

"RFHV" is a virus of the herpes family detected in the 
tissue samples of Macaca nemestrina monkeys affected 
with Retroperitoneal Fibromatosis (RF). RFHV is synonymous 
with the terms "RFHV1", "RFHVMn", and "RFMn". 
"KSHV" is a virus of the herpes virus family detected in the 
tissue samples of humans affected with Kaposi's Sarcoma 
(KS). A third member of the RFHV/KSHV subfamily is a 
virus identified in a M. mulatta monkey. The virus is 
referred to herein as "RFHV2". "RFHV2" is synonymous 
with the terms "RFHVMm" and "RFMm". 

The "RFHV/KSHV subfamily" is a term used herein to 
refer to a collection of herpes viruses capable of infecting 
vertebrate species. The subfamily consists of members that 
have Glycoprotein B sequences that are more closely related 
to the corresponding sequences of RFHV or KSHV 
than to those of other herpes viruses, including sHV1, eHV2, bHV4, 
mHV68 and hEBV. Preferably, the polynucleotide encoding 
Glycoprotein B comprises a segment that is at least 65% 
identical to that of RFHV (SEQ ID NO:1) or KSHV (SEQ 
ID NO:3) between residues 36 and 354; or at least about 
74% identical to the oligonucleotide SHMDA, or at least 
about 73% identical to the oligonucleotide CFSSB, or at 
least about 72% identical to the oligonucleotide ENTFA, or at 
least about 80% identical to the oligonucleotide DNIQB. RFHV 
and KSHV are exemplary members of the RFHV/KSHV 
subfamily. The RFHV/KSHV subfamily represents a subset 
of the gamma subfamily of herpes viruses. 

The terms "polynucleotide" and "oligonucleotide" are 
used interchangeably, and refer to a polymeric form of 
nucleotides of any length, either deoxyribonucleotides or 
ribonucleotides, or analogs thereof. Polynucleotides may 
have any three-dimensional structure, and may perform any 
function, known or unknown. The following are 
non-limiting examples of polynucleotides: a gene or gene 
fragment, exons, introns, messenger RNA (mRNA), transfer 
RNA, ribosomal RNA, ribozymes, cDNA, recombinant 
polynucleotides, branched polynucleotides, plasmids, 
vectors, isolated DNA of any sequence, isolated RNA of any 
sequence, nucleic acid probes, and primers. A polynucleotide 
may comprise modified nucleotides, such as methylated 
nucleotides and nucleotide analogs. If present, modifications 
to the nucleotide structure may be imparted before 
or after assembly of the polymer. The sequence of nucleotides 
may be interrupted by non-nucleotide components. A 
polynucleotide may be further modified after 
polymerization, such as by conjugation with a labeling 
component. 

The term polynucleotide, as used herein, refers to both 
double- and single-stranded molecules. Unless otherwise 
specified or required, any embodiment of the invention 
described herein that is a polynucleotide encompasses both 
the double-stranded form and each of two complementary 
single-stranded forms known or predicted to make up the 
double-stranded form. 

In the context of polynucleotides, a "linear sequence" or 
a "sequence" is an order of nucleotides in a polynucleotide 
in a 5' to 3' direction in which residues that neighbor each 
other in the sequence are contiguous in the primary structure 
of the polynucleotide. A "partial sequence" is a linear 
sequence of part of a polynucleotide which is known to 
comprise additional residues in one or both directions. 

"Hybridization" refers to a reaction in which one or more 
polynucleotides react to form a complex that is stabilized via 
hydrogen bonding between the bases of the nucleotide 
residues. The hydrogen bonding may occur by Watson-Crick 
base pairing, Hoogsteen binding, or in any other 
sequence-specific manner. The complex may comprise two strands 
forming a duplex structure, three or more strands forming a 
multi-stranded complex, a single self-hybridizing strand, or 
any combination of these. A hybridization reaction may 
constitute a step in a more extensive process, such as the 
initiation of a PCR, or the enzymatic cleavage of a poly- 
nucleotide by a ribozyme. 

Hybridization reactions can be performed under conditions 
of different "stringency". Conditions that increase the 
stringency of a hybridization reaction are widely known and 
published in the art: see, for example, Sambrook, Fritsch & 
Maniatis. Examples of relevant conditions include (in order 
of increasing stringency): incubation temperatures of 25° C., 
37° C., 50° C., and 68° C.; buffer concentrations of 10×SSC, 
6×SSC, 1×SSC, 0.1×SSC (where SSC is 0.15 M NaCl and 
15 mM citrate buffer) and their equivalent using other buffer 
systems; formamide concentrations of 0%, 25%, 50%, and 
75%; incubation times from 5 min to 24 h; 1, 2, or more 
washing steps; wash incubation times of 1, 5, or 15 min; and 
wash solutions of 6×SSC, 1×SSC, 0.1×SSC, or deionized 
water. 

"T_m" is the temperature in degrees Centigrade at which 
50% of a polynucleotide duplex made of complementary 
strands hydrogen bonded in an antiparallel direction by 
Watson-Crick base pairing dissociates into single strands 
under the conditions of the experiment. T_m may be predicted 
according to a standard formula; for example: 

$$T_m = 81.5 + 16.6 \log [\mathrm{Na}^+] + 0.41(\%\,\mathrm{G/C}) - 0.61(\%\,\mathrm{F}) - 600/L$$

where [Na+] is the cation concentration (usually sodium ion) 
in mol/L; (% G/C) is the number of G and C residues as a 
percentage of total residues in the duplex; (% F) is the 
percent formamide in solution (wt/vol); and L is the number 
of nucleotides in each strand of the duplex. 
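
For illustration, the melting temperature formula given above can be evaluated directly. The following Python sketch is an assumption-laden example rather than a prescribed method: the function and parameter names are hypothetical, the logarithm is taken as base 10 (the usual convention for this formula), and the example inputs are arbitrary.

```python
import math


def predict_tm(na_molar: float, gc_percent: float,
               formamide_percent: float, length: int) -> float:
    """Predicted melting temperature in degrees Centigrade:
    81.5 + 16.6*log10[Na+] + 0.41*(%G/C) - 0.61*(%F) - 600/L."""
    if na_molar <= 0 or length <= 0:
        raise ValueError("cation concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(na_molar)   # assumed base-10 logarithm
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 600.0 / length)


# Arbitrary example: 1 mol/L cation, 50% G/C, no formamide, 100-residue duplex.
print(round(predict_tm(1.0, 50.0, 0.0, 100), 2))
```

Raising the formamide percentage or lowering the cation concentration lowers the predicted value, which is consistent with the stringency conditions listed above.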

A "stable duplex" of polynucleotides, or a "stable complex" 
formed between any two or more components in a 
biochemical reaction, refers to a duplex or complex that is 
sufficiently long-lasting to persist between the formation of 
the duplex or complex, and its subsequent detection. The 
duplex or complex must be able to withstand whatever 
conditions exist or are introduced between the moment of 
formation and the moment of detection, these conditions 
being a function of the assay or reaction which is being 
performed. Intervening conditions which may optionally be 
present and which may dislodge a duplex or complex 
include washing, heating, adding additional solutes or solvents 
to the reaction mixture (such as denaturants), and 
competing with additional reacting species. Stable duplexes 
or complexes may be irreversible or reversible, but must 
meet the other requirements of this definition. Thus, a 
transient complex may form in a reaction mixture, but it 
does not constitute a stable complex if it dissociates 
spontaneously or as a result of a newly imposed condition or 
manipulation introduced before detection. 

When stable duplexes form in an antiparallel configuration 
between two single-stranded polynucleotides, particularly 
under conditions of high stringency, the strands are 
essentially "complementary". A double-stranded polynucleotide 
can be "complementary" to another polynucleotide, if 
a stable duplex can form between one of the strands of the 
first polynucleotide and the second. A complementary 
sequence predicted from the sequence of a single-stranded 
polynucleotide is the optimum sequence of standard nucleotides 
expected to form hydrogen bonding with the 
single-stranded polynucleotide according to generally accepted 
base-pairing rules. 
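
As a minimal sketch of predicting a complementary sequence under Watson-Crick base-pairing rules, the following Python example returns the antiparallel complement of a DNA strand written 5' to 3'. The function names are hypothetical, and only the four standard DNA nucleotides are handled; ambiguous or modified residues are outside the scope of the sketch.

```python
# Watson-Crick pairs for the four standard DNA nucleotides.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complementary strand of a DNA sequence, also written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if strand_b is the exact antiparallel complement of strand_a."""
    return reverse_complement(strand_a).upper() == strand_b.upper()


print(reverse_complement("ATGC"))
print(are_complementary("ATGC", "GCAT"))
```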

A "sense" strand and an "antisense" strand when used in 
the same context refer to single-stranded polynucleotides 
which are complementary to each other. They may be 
opposing strands of a double-stranded polynucleotide, or 
one strand may be predicted from the other according to 
generally accepted base-pairing rules. If not specified, the 
assignment of one or the other strand as "sense" or 
"antisense" may be arbitrary. In relation to a 
polypeptide-encoding segment of a polynucleotide, the "sense" strand is 
generally the strand comprising the encoding segment. 

When comparison is made between polynucleotides for 
degree of identity, it is implicitly understood that 
complementary strands are easily generated, and the sense or 
antisense strand is selected or predicted that maximizes the 
degree of identity between the polynucleotides being 
compared. For example, where one or both of the polynucleotides 
being compared is double-stranded, the sequences are 
identical if one strand of the first polynucleotide is identical 
with one strand of the second polynucleotide. Similarly, 
when a polynucleotide probe is described as identical to its 
target, it is understood that it is the complementary strand of 
the target that participates in the hybridization reaction 
between the probe and the target. 

A linear sequence of nucleotides is "essentially identical" 
to another linear sequence, if both sequences are capable of 
hybridizing to form duplexes with the same complementary 
polynucleotide. Sequences that hybridize under conditions 
of greater stringency are more preferred. It is understood that 
hybridization reactions can accommodate insertions, 
deletions, and substitutions in the nucleotide sequence. 
Thus, linear sequences of nucleotides can be essentially 
identical even if some of the nucleotide residues do not 
precisely correspond or align. Sequences that correspond or 
align more closely to the invention disclosed herein are 
comparably more preferred. Generally, a polynucleotide 
region of about 25 residues is essentially identical to another 
region, if the sequences are at least about 85% identical; 
more preferably, they are at least about 90% identical; more 
preferably, they are at least about 95% identical; still more 
preferably, the sequences are 100% identical. A polynucleotide 
region of 40 residues or more will be essentially 
identical to another region, after alignment of homologous 
portions, if the sequences are at least about 75% identical; 
more preferably, they are at least about 80% identical; more 
preferably, they are at least about 85% identical; even more 
preferably, they are at least about 90% identical; still more 
preferably, the sequences are 100% identical. 
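
The length-dependent identity thresholds in the preceding paragraph can be sketched as a simple check. This Python example is illustrative only: the function name is hypothetical, the "at least about" figures are treated as hard cut-offs for simplicity, and regions shorter than 40 residues are assumed to use the 85% figure quoted for regions of about 25 residues, which the text does not state explicitly.

```python
def is_essentially_identical(length: int, percent_identical: float) -> bool:
    """Apply the minimum identity figures quoted in the text:
    about 85% for regions of about 25 residues, about 75% for
    regions of 40 residues or more."""
    if length <= 0:
        raise ValueError("length must be positive")
    # Assumption: anything under 40 residues uses the 85% figure.
    threshold = 75.0 if length >= 40 else 85.0
    return percent_identical >= threshold


print(is_essentially_identical(25, 88.0))
print(is_essentially_identical(25, 80.0))
print(is_essentially_identical(319, 76.0))
```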

In determining whether polynucleotide sequences are 
essentially identical, a sequence that preserves the functionality 
of the polynucleotide with which it is being compared 
is particularly preferred. Functionality can be determined by 
different parameters. For example, if the polynucleotide is to 
be used in reactions that involve hybridizing with another 
polynucleotide, then preferred sequences are those which 
hybridize to the same target under similar conditions. In 
general, the T_m of a DNA duplex decreases by about 1° C. 
for every 1% decrease in sequence identity for duplexes of 
200 or more residues; or by about 5° C. for duplexes of less 
than 40 residues, depending on the position of the 
mismatched residues (see, e.g., Meinkoth et al.). Essentially 
identical sequences of about 100 residues will generally 
form a stable duplex with each other's respective 
complementary sequence at about 20° C. less than T_m; preferably, 
they will form a stable duplex at about 15° C. less; more 
preferably, they will form a stable duplex at about 10° C. 
less; even more preferably, they will form a stable duplex at 
about 5° C. less; still more preferably, they will form a stable 
duplex at about T_m. In another example, if the polypeptide 
encoded by the polynucleotide is an important part of its 
functionality, then preferred sequences are those which 
encode identical or essentially identical polypeptides. Thus, 
nucleotide differences which cause a conservative amino 
acid substitution are preferred over those which cause a 
non-conservative substitution, nucleotide differences which 
do not alter the amino acid sequence are more preferred, 
while identical nucleotides are even more preferred. Insertions 
or deletions in the polynucleotide that result in insertions 
or deletions in the polypeptide are preferred over those 
that result in the downstream coding region being rendered 
out of phase; polynucleotide sequences comprising no insertions 
or deletions are even more preferred. The relative 
importance of hybridization properties and the encoded 
polypeptide sequence of a polynucleotide depends on the 
application of the invention. 

A polynucleotide has the same "characteristics" as 
another polynucleotide if both are capable of forming a 
stable duplex with a particular third polynucleotide under 
similar conditions of maximal stringency. Preferably, in 
addition to similar hybridization properties, the polynucle- 
otides also encode essentially identical polypeptides. 

"Conserved" residues of a polynucleotide sequence are 
those residues which occur unaltered in the same position of 
two or more related sequences being compared. Residues 
that are relatively conserved are those that are conserved 
amongst more related sequences or with a greater degree of 
identity than residues appearing elsewhere in the sequences. 

"Related" polynucleotides are polynucleotides that share 
a significant proportion of identical residues. 

As used herein, a "degenerate" oligonucleotide sequence 
is a designed sequence derived from at least two related 
originating polynucleotide sequences as follows: the resi- 
dues that are conserved in the originating sequences are 
preserved in the degenerate sequence, while residues that are 
not conserved in the originating sequences may be provided 
as several alternatives in the degenerate sequence. For 
example, the degenerate sequence AYASA may be designed 
from originating sequences ATACA and ACAGA, where Y 
is C or T and S is C or G. Y and S are examples of 
"ambiguous" residues. A degenerate segment is a segment of 
a polynucleotide containing a degenerate sequence. 
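
The construction of a degenerate sequence from originating sequences can be sketched as follows. This Python example is offered as an illustration under stated assumptions: the originating sequences are already aligned and of equal length, ambiguous residues are written with the standard IUPAC nucleotide codes (of which Y and S above are two), and the function name is hypothetical.

```python
# Standard IUPAC nucleotide codes, keyed by the set of bases each represents.
IUPAC_CODE = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def degenerate_sequence(originating: list[str]) -> str:
    """Keep conserved residues; replace non-conserved positions with the
    IUPAC code covering every residue seen at that position."""
    if len(originating) < 2:
        raise ValueError("at least two originating sequences are required")
    if len({len(s) for s in originating}) != 1:
        raise ValueError("originating sequences must have equal length")
    columns = zip(*(s.upper() for s in originating))
    return "".join(IUPAC_CODE[frozenset(col)] for col in columns)


# The example from the text: ATACA and ACAGA give AYASA.
print(degenerate_sequence(["ATACA", "ACAGA"]))
```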

It is understood that a synthetic oligonucleotide comprising 
a degenerate sequence is actually a mixture of closely 
related oligonucleotides sharing an identical sequence, 
except at the ambiguous positions. Such an oligonucleotide 
is usually synthesized as a mixture of all possible 
combinations of nucleotides at the ambiguous positions. Each of 
the oligonucleotides in the mixture is referred to as an 
"alternative form". The number of forms in the mixture is 
equal to 

$$\prod_{n} k_n$$

where k_n is the number of alternative nucleotides allowed at 
each position n. 
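
To illustrate the count of alternative forms, the following Python sketch multiplies the number of alternatives at each position and, optionally, lists every form in the mixture. It assumes the degenerate sequence is written with standard IUPAC nucleotide codes; the function names are hypothetical.

```python
from itertools import product
from math import prod

# Bases represented by each standard IUPAC nucleotide code.
ALTERNATIVES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_forms(degenerate: str) -> int:
    """Product over all positions of the number of allowed nucleotides."""
    return prod(len(ALTERNATIVES[c]) for c in degenerate.upper())


def expand_forms(degenerate: str) -> list[str]:
    """Every alternative form present in the synthesized mixture."""
    choices = (ALTERNATIVES[c] for c in degenerate.upper())
    return ["".join(p) for p in product(*choices)]


# AYASA has two ambiguous positions with two alternatives each: 4 forms.
print(count_forms("AYASA"))
print(expand_forms("AYASA"))
```

Because the count grows multiplicatively with each ambiguous position, listing the forms is practical only for sequences with modest degeneracy.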

As used herein, a "consensus" oligonucleotide sequence 
is a designed sequence derived from at least two related 
originating polynucleotide sequences as follows: the residues 
that are conserved in all originating sequences are 
preserved in the consensus sequence; while at positions 
where residues are not conserved, one alternative is chosen 
from amongst the originating sequences. In general, the 
nucleotide chosen is the one which occurs in the greatest 
frequency in the originating sequences. For example, the 
consensus sequence AAAAA may be designed from 
originating sequences CAAAA, AAGAA, and AAAAT. A 
consensus segment is a segment of a polynucleotide containing 
a consensus sequence. 
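
The consensus rule described above can be sketched in a few lines of Python. The example assumes pre-aligned originating sequences of equal length and uses a hypothetical function name. Where two residues occur with equal frequency, the sketch takes the one seen first in the input order; that tie-break is an assumption, since the text only requires that one alternative be chosen from amongst the originating sequences.

```python
from collections import Counter


def consensus_sequence(originating: list[str]) -> str:
    """At each position, take the residue occurring with the greatest
    frequency among the originating sequences."""
    if len(originating) < 2:
        raise ValueError("at least two originating sequences are required")
    if len({len(s) for s in originating}) != 1:
        raise ValueError("originating sequences must have equal length")
    columns = zip(*(s.upper() for s in originating))
    # most_common(1) returns the first-seen residue when counts are tied.
    return "".join(Counter(col).most_common(1)[0][0] for col in columns)


# The example from the text: CAAAA, AAGAA and AAAAT give AAAAA.
print(consensus_sequence(["CAAAA", "AAGAA", "AAAAT"]))
```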

A polynucleotide "fragment" or "insert" as used herein 
generally represents a sub-region of the full-length form, but 
the entire full-length polynucleotide may also be included. 

Polynucleotides "correspond" to each other if they are 
believed to be derived from each other or from a common 
ancestor. For example, encoding regions in the genes of 
different viruses correspond if they share a significant degree 
of identity, map to the same location of the genome, or 
encode proteins that perform a similar biochemical function. 
Messenger RNA corresponds to the gene from which it is 
transcribed. cDNA corresponds to the RNA from which it 
has been produced, and to the gene that encodes the RNA. 
A protein corresponds to a polynucleotide encoding it, and 
to an antibody that is capable of binding it specifically. 

A "probe" when used in the context of polynucleotide 
manipulation refers to an oligonucleotide which is provided 
as a reagent to detect a target potentially present in a sample 
of interest by hybridizing with the target. Usually, a probe 
will comprise a label or a means by which a label can be 
attached, either before or subsequent to the hybridization 
reaction. Suitable labels include, but are not limited to, 
radioisotopes, fluorochromes, chemiluminescent 
compounds, dyes, and proteins, including enzymes. 

A "primer" is an oligonucleotide, generally with a free 
3'-OH group, that binds to a target potentially present in a 
sample of interest by hybridizing with the target, and 
thereafter promotes polymerization of a polynucleotide 
complementary to the target. 

Processes of producing replicate copies of the same 
polynucleotide, such as PCR or gene cloning, are 
collectively referred to herein as "amplification" or "replication". 
For example, single- or double-stranded DNA may be 
replicated to form another DNA with the same sequence. RNA 
may be replicated, for example, by an RNA-directed RNA 
polymerase, or by reverse-transcribing the RNA into DNA and then 
performing a PCR. In the latter case, the amplified copy of 
the RNA is a DNA with the identical sequence. 

A "polymerase chain reaction" ("PCR") is a reaction in
which replicate copies are made of a target polynucleotide
using one or more primers, and a catalyst of polymerization,
such as a reverse transcriptase or a DNA polymerase, and
particularly a thermally stable polymerase enzyme.
Generally, a PCR involves reiteratively performing three 
steps: "annealing", in which the temperature is adjusted such 
that oligonucleotide primers are permitted to form a duplex 
with the polynucleotide to be amplified; "elongating", in 
which the temperature is adjusted such that oligonucleotides 
that have formed a duplex are elongated with a DNA 
polymerase, using the polynucleotide to which they have
formed the duplex as a template; and "melting", in which the 
temperature is adjusted such that the polynucleotide and 
elongated oligonucleotides dissociate. The cycle is then 
repeated until the desired amount of amplified polynucle- 
otide is obtained. Methods for PCR are taught in U.S. Pat. 
No. 4,683,195 (Mullis) and U.S. Pat. No. 4,683,202 (Mullis 
et al.). 
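
By way of illustration only, the reiterative nature of the cycle described above can be sketched numerically. The sketch assumes an idealized reaction in which a fixed fraction of templates (the hypothetical parameter `efficiency`) is copied in each cycle; the function names and the parameter are assumptions made for this example and are not part of the disclosure.

```python
def copies_after(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after a number of anneal/elongate/melt cycles.

    Assumes an idealized reaction: every cycle multiplies the copy number
    by (1 + efficiency), where efficiency = 1.0 means perfect doubling.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return initial_copies * (1.0 + efficiency) ** cycles


def cycles_needed(initial_copies: float, target_copies: float, efficiency: float = 1.0) -> int:
    """Smallest number of cycles after which the target amount is reached."""
    if initial_copies <= 0:
        raise ValueError("initial_copies must be positive")
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    copies = float(initial_copies)
    cycles = 0
    # Repeat the cycle until the desired amount is obtained.
    while copies < target_copies:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles
```

Under these assumptions, a single template molecule requires 10 ideal cycles to exceed 1000 copies; real reactions plateau and are less efficient, so the sketch gives a lower bound on the number of cycles.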

A "control element" or "control sequence" is a nucleotide 
sequence involved in an interaction of molecules that con- 
tributes to the functional regulation of a polynucleotide, 
including replication, duplication, transcription, splicing, 
translation, or degradation of the polynucleotide. The regu- 
lation may affect the frequency, speed, or specificity of the 
process, and may be enhancing or inhibitory in nature. 
Control elements are known in the art. For example, a
"promoter" is a control element. A promoter
is a DNA region capable under certain conditions of binding 
RNA polymerase and initiating transcription of a coding 
region located downstream (in the 3' direction) from the 
promoter. 

"Operatively linked" refers to a juxtaposition of genetic 
elements, wherein the elements are in a relationship permit- 
ting them to operate in the expected manner. For instance, a 
promoter is operatively linked to a coding region if the 
promoter helps initiate transcription of the coding sequence. 
There may be intervening residues between the promoter 
and coding region so long as this functional relationship is 
maintained. 

The terms "polypeptide", "peptide" and "protein" are 
used interchangeably herein to refer to polymers of amino 
acids of any length. The polymer may be linear or branched, 
it may comprise modified amino acids, and it may be 
interrupted by non-amino acids. The terms also encompass
an amino acid polymer that has been modified naturally or 
by intervention; for example, disulfide bond formation, 
glycosylation, lipidation, acetylation, phosphorylation, or 
any other manipulation, such as conjugation with a labeling 
component. 

In the context of polypeptides, a "linear sequence" or a 
"sequence" is an order of amino acids in a polypeptide in an 
N-terminal to C-terminal direction in which residues that 
neighbor each other in the sequence are contiguous in the 
primary structure of the polypeptide. A "partial sequence" is 
a linear sequence of part of a polypeptide which is known to 
comprise additional residues in one or both directions. 

A linear sequence of amino acids is "essentially identical" 
to another sequence if the two sequences have a substantial 
degree of sequence identity. It is understood that the folding 
and the biochemical function of proteins can accommodate 
insertions, deletions, and substitutions in the amino acid 
sequence. Thus, linear sequences of amino acids can be 
essentially identical even if some of the residues do not 
precisely correspond or align. Sequences that correspond or 
align more closely to the invention disclosed herein are more 
preferred. It is also understood that some amino acid sub- 
stitutions are more easily tolerated. For example, substitu- 
tion of an amino acid with hydrophobic side chains, aro- 
matic side chains, polar side chains, side chains with a 
positive or negative charge, or side chains comprising two or
fewer carbon atoms, by another amino acid with a side chain
of like properties can occur without disturbing the essential
identity of the two sequences. Methods for determining
homologous regions and scoring the degree of homology are
well known in the art; see for example Altschul et al. and
Henikoff et al. Well-tolerated sequence differences are
referred to as "conservative substitutions". Thus, sequences
with conservative substitutions are preferred over those with
other substitutions in the same positions; sequences with
identical residues at the same positions are still more
preferred.

Generally, a polypeptide region will be essentially iden- 
tical to another region, after alignment of homologous 
portions, if the sequences are at least about 92% identical;
more preferably, they are at least about 95% identical; more 
preferably, they are at least about 95% identical and com- 
prise at least another 2% which are either identical or are 
conservative substitutions; more preferably, they are at least 
about 97% identical; more preferably, they are at least about
97% identical, and comprise at least another 2% which are 
either identical or are conservative substitutions; more 
preferably, they are at least about 99% identical; still more 
preferably, the sequences are 100% identical. 
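
The identity criteria above can be illustrated with a short sketch that scores two polypeptide regions that have already been aligned to equal length. The assignment of residues to side-chain groups in `GROUPS` is an assumption made for this example (the text names the side-chain properties but not their members), and the function names are hypothetical.

```python
# Hypothetical grouping of residues by side-chain property. This is an
# assumption for illustration; the disclosure does not list group members.
GROUPS = [
    set("AVLIM"),  # hydrophobic (aliphatic) side chains
    set("FWY"),    # aromatic side chains
    set("STNQ"),   # polar, uncharged side chains
    set("KRH"),    # positively charged side chains
    set("DE"),     # negatively charged side chains
]


def compare_regions(a: str, b: str) -> tuple[float, float]:
    """Return (% identical, % conservative substitutions) for two
    pre-aligned polypeptide regions of equal length."""
    if len(a) != len(b) or not a:
        raise ValueError("regions must be non-empty and of equal length")
    identical = 0
    conservative = 0
    for x, y in zip(a, b):
        if x == y:
            identical += 1
        elif any(x in g and y in g for g in GROUPS):
            conservative += 1
    n = len(a)
    return 100.0 * identical / n, 100.0 * conservative / n


def essentially_identical(a: str, b: str, min_identity: float = 92.0) -> bool:
    """Apply the least stringent criterion above (about 92% identity)."""
    identity, _ = compare_regions(a, b)
    return identity >= min_identity
```

For a 100-residue region with three substitutions, two of which fall within a single group, `compare_regions` would report 97% identity and a further 2% conservative substitutions, which corresponds to one of the more preferred tiers recited above.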

In determining whether polypeptide sequences are
essentially identical, a sequence that preserves the functionality of
the polypeptide with which it is being compared is
particularly preferred. Functionality may be established by different
parameters, such as enzymatic activity, the binding rate or
affinity in a substrate-enzyme or receptor-ligand interaction,
the binding affinity with an antibody, and X-ray
crystallographic structure.

A polypeptide has the same "characteristics" as another
polypeptide if it displays the same biochemical function,
such as enzyme activity, ligand binding, or antibody
reactivity. Preferred characteristics of a polypeptide related to a
Glycoprotein B or a Glycoprotein B fragment are the ability
to bind analogs of the cell surface receptor bound by
Glycoprotein B of other herpes species, the ability to
promote membrane fusion with a target cell, and the ability to
promote viral penetration of the host cell. Also preferred is
a polypeptide that displays the same biochemical function as
the polypeptide with which it is being compared, and in
addition, is believed to have a similar three-dimensional
conformation, as predicted by computer modeling or
determined by such techniques as X-ray crystallography.

The "biochemical function", "biological function" or
"biological activity" of a polypeptide includes any feature of
the polypeptide detectable by suitable experimental
investigation. "Altered" biochemical function can refer to a
change in the primary, secondary, tertiary, or quaternary
structure of the polypeptide; detectable, for example, by
molecular weight determination, circular dichroism,
antibody binding, difference spectroscopy, or nuclear magnetic
resonance. It can also refer to a change in reactivity, such as
the ability to catalyze a certain reaction, or the ability to bind
a cofactor, substrate, inhibitor, drug, hapten, or other
polypeptide. A substance may be said to "interfere" with the
biochemical function of a polypeptide if it alters the
biochemical function of the polypeptide in any of these ways.

A "fusion polypeptide" is a polypeptide comprising 
regions in a different position in the sequence than occurs in 
nature. The regions may normally exist in separate proteins 
and are brought together in the fusion polypeptide; or they 
may normally exist in the same protein but are placed in a
new arrangement in the fusion polypeptide. A fusion
polypeptide may be created, for example, by chemical
synthesis, or by creating and translating a polynucleotide in
which the peptide regions are encoded in the desired
relationship.

An "antibody" (interchangeably used in plural form) is an 
immunoglobulin molecule capable of specific binding to a 
target, such as a polypeptide, through at least one antigen 
recognition site, located in the variable region of the immu- 
noglobulin molecule. As used herein, the term encompasses 
not only intact antibodies, but also fragments thereof, 
mutants thereof, fusion proteins, humanized antibodies, and 
any other modified configuration of the immunoglobulin 
molecule that comprises an antigen recognition site of the 
required specificity. 

"Immunological recognition" or "immunological reactiv- 
ity" refers to the specific binding of a target through at least 
one antigen recognition site in an immunoglobulin or a 
related molecule, such as a B cell receptor or a T cell 
receptor. 

The term "antigen" refers to the target molecule that is 
specifically bound by an antibody through its antigen
recognition site. The antigen may, but need not, be chemically
related to the immunogen that stimulated production of the 
antibody. The antigen may be polyvalent, or it may be a 
monovalent hapten. Examples of kinds of antigens that can 
be recognized by antibodies include polypeptides, 
polynucleotides, other antibody molecules, 
oligosaccharides, complex lipids, drugs, and chemicals. 

An "immunogen" is a compound capable of stimulating 
production of an antibody when injected into a suitable host, 
usually a mammal. Compounds with this property are 
described as "immunogenic". Compounds may be rendered 
immunogenic by many techniques known in the art, includ- 
ing crosslinking or conjugating with a carrier to increase 
valency, mixing with a mitogen to increase the immune 
response, and combining with an adjuvant to enhance pre- 
sentation. 

A "vaccine" is a pharmaceutical preparation for human or 
animal use, which is administered with the intention of 
conferring the recipient with a degree of specific immuno- 
logical reactivity against a particular target, or group of 
targets. The immunological reactivity may be antibodies or 
cells (particularly B cells, plasma cells, T helper cells, and 
cytotoxic T lymphocytes, and their precursors) that are 
immunologically reactive against the target, or any combi- 
nation thereof. Possible targets include foreign or pathologi- 
cal compounds, such as an exogenous protein, a pathogenic 
virus, or an antigen expressed by a cancer cell. The immu- 
nological reactivity may be desired for experimental 
purposes, for the treatment of a particular condition, for the 
elimination of a particular substance, or for prophylaxis 
against a particular condition or substance. Unless specifi- 
cally indicated, a vaccine referred to herein may be either a 
passive vaccine or an active vaccine, or it may have the 
properties of both. 

A "passive vaccine" is a vaccine that does not require 
participation of the recipient's immune response to exert its 
effect. Usually, it is comprised of antibody molecules reac- 
tive against the target. The antibodies may be obtained from 
a donor subject and sufficiently purified for administration to 
the recipient, or they may be produced in vitro, for example, 
from a culture of hybridoma cells, or by genetically engi- 
neering a polynucleotide encoding an antibody molecule. 

An "active vaccine" is a vaccine administered with the 
intention of eliciting a specific immune response within the 
recipient, that in turn has the desired immunological reac- 
tivity against the target. An active vaccine comprises a 
suitable immunogen. The immune response that is desired 
may be either humoral or cellular, systemic or secretory, or
any combination of these. 

A "reagent" polynucleotide, polypeptide, or antibody, is a 
substance provided for a reaction, the substance having 
some known and desirable parameters for the reaction.

A reaction mixture may also contain a "target", such as a 
polynucleotide, antibody, or polypeptide that the reagent is 
capable of reacting with. For example, in some types of 
diagnostic tests, the amount of the target in a sample is 
determined by adding a reagent, allowing the reagent and
target to react, and measuring the amount of reaction prod- 
uct. In the context of clinical management, a target may also 
be a cell, collection of cells, tissue, or organ that is the object 
of an administered substance, such as a pharmaceutical 
compound. A cell that is a target for a viral infection is one
to which a virus preferentially localizes for such purposes as 
replication or transformation into a latent form. 

An "isolated" polynucleotide, polypeptide, protein, 
antibody, or other substance refers to a preparation of the 
substance devoid of at least some of the other components
that may also be present where the substance or a similar
substance naturally occurs or is initially obtained from.
Thus, for example, an isolated substance may be prepared by
using a purification technique to enrich it from a source
mixture. Enrichment can be measured on an absolute basis,
such as weight per volume of solution, or it can be measured
in relation to a second, potentially interfering substance
present in the source mixture. Increasing enrichments of the
embodiments of this invention are increasingly more
preferred. Thus, for example, a 2-fold enrichment is preferred,
10-fold enrichment is more preferred, 100-fold enrichment
is more preferred, 1000-fold enrichment is even more
preferred. A substance can also be provided in an isolated state
by a process of artificial assembly, such as by chemical
synthesis or recombinant expression.

A polynucleotide used in a reaction, such as a probe used 
in a hybridization reaction, a primer used in a PCR, or a 
polynucleotide present in a pharmaceutical preparation, is 
referred to as "specific" or "selective" if it hybridizes or 
reacts with the intended target more frequently, more
rapidly, or with greater duration than it does with alternative
substances. Similarly, a polypeptide is referred to as
"specific" or "selective" if it binds an intended target, such as a
ligand, hapten, substrate, antibody, or other polypeptide,
more frequently, more rapidly, or with greater duration than
it does to alternative substances. An antibody is referred to
as "specific" or "selective" if it binds via at least one antigen
recognition site to the intended target more frequently, more
rapidly, or with greater duration than it does to alternative
substances. A polynucleotide, polypeptide, or antibody is
said to "selectively inhibit" or "selectively interfere with" a
reaction if it inhibits or interferes with the reaction between
particular substrates to a greater degree or for a greater
duration than it does with the reaction between alternative
substrates.

A "pharmaceutical candidate" or "drug candidate" is a 
compound believed to have therapeutic potential, that is to 
be tested for efficacy. The "screening" of a pharmaceutical 
candidate refers to conducting an assay that is capable of 
evaluating the efficacy and/or specificity of the candidate. In
this context, "efficacy" refers to the ability of the candidate 
to affect the cell or organism it is administered to in a 
beneficial way: for example, the limitation of the pathology 
due to an invasive virus. 

The "effector component" of a pharmaceutical
preparation is a component which modifies target cells by altering
their function in a desirable way when administered to a
subject bearing the cells. Some advanced pharmaceutical
preparations also have a "targeting component", such as an 
antibody, which helps deliver the effector component more 
efficaciously to the target site. Depending on the desired 
action, the effector component may have any one of a 
number of modes of action. For example, it may restore or 
enhance a normal function of a cell, it may eliminate or 
suppress an abnormal function of a cell, or it may alter a 
cell's phenotype. Alternatively, it may kill or render dormant 
a cell with pathological features, such as a virally infected 
cell. Examples of effector components are provided in a later 
section. 

A "cell line" or "cell culture" denotes higher eukaryotic 
cells grown or maintained in vitro. It is understood that the 
descendants of a cell may not be completely identical (either 
morphologically, genotypically, or phenotypically) to the 
parent cell. 

A "host cell" is a cell which has been transformed, or is 
capable of being transformed, by administration of an exog- 
enous polynucleotide. A "host cell" includes progeny of the 
original transformant. 

"Genetic alteration" refers to a process wherein a genetic 
element is introduced into a cell other than by natural cell 
division. The element may be heterologous to the cell, or it 
may be an additional copy or improved version of an 
element already present in the cell. Genetic alteration may 
be effected, for example, by transfecting a cell with a 
recombinant plasmid or other polynucleotide through any 
process known in the art, such as electroporation, calcium 
phosphate precipitation, contacting with a
polynucleotide-liposome complex, or by transduction or infection with a
DNA or RNA virus or viral vector. The alteration is prefer- 
ably but not necessarily inheritable by progeny of the altered 
cell. 

An "individual" refers to vertebrates, particularly mem- 
bers of a mammalian species, and includes but is not limited 
to domestic animals, sports animals, and primates, including 
humans. 

The term "primate" as used herein refers to any member 
of the highest order of mammalian species. This includes 
(but is not limited to) prosimians, such as lemurs and lorises; 
tarsioids, such as tarsiers; new-world monkeys, such as 
squirrel monkeys (Saimiri sciureus) and tamarins; old-world 
monkeys such as macaques (including Macaca nemestrina, 
Macaca fascicularis, and Macaca fuscata); hylobatids, such 
as gibbons and siamangs; pongids, such as orangutans, 
gorillas, and chimpanzees; and hominids, including humans. 

The "pathology" caused by a herpes virus infection is 
anything that compromises the well-being or normal physi- 
ology of the host. This may involve (but is not limited to) 
destructive invasion of the virus into previously uninfected 
cells, replication of the virus at the expense of the normal 
metabolism of the cell, generation of toxins or other unnatu- 
ral molecules by the virus, irregular growth of cells or 
intercellular structures (including fibrosis), irregular or sup- 
pressed biological activity of infected cells, malignant 
transformation, interference with the normal function of 
neighboring cells, aggravation or suppression of an inflam- 
matory or immunological response, and increased suscepti- 
bility to other pathogenic organisms and conditions. 

"Treatment" of an individual or a cell is any type of 
intervention in an attempt to alter the natural course of the 
individual or cell. For example, treatment of an individual 
may be undertaken to decrease or limit the pathology caused 
by a herpes virus infecting the individual. Treatment 
includes (but is not limited to) administration of a 
composition, such as a pharmaceutical composition, and 
may be performed either prophylactically, or therapeutically,
subsequent to the initiation of a pathologic event or contact 
with an etiologic agent. 

It is understood that a clinical or biological "sample" 
encompasses a variety of sample types obtained from a
subject and useful in an in vitro procedure, such as a
diagnostic test. The definition encompasses solid tissue
samples obtained as a surgical removal, a pathology
specimen, or a biopsy specimen, tissue cultures or cells
derived therefrom and the progeny thereof, and sections or
smears prepared from any of these sources. Non-limiting
examples are samples obtained from infected sites, fibrotic
sites, unaffected sites, and tumors. The definition also
encompasses blood, spinal fluid, and other liquid samples of
biologic origin, and may refer to either the cells or cell
fragments suspended therein, or to the liquid medium and its 
solutes. The definition also includes samples that have been 
solubilized or enriched for certain components, such as 
DNA, RNA, protein, or antibody. 

Oligonucleotide primers and probes described herein
have been named as follows: The first part of the designation
is the single amino acid code for a portion of the conserved
region of the polypeptide they are based upon, usually 4
residues long. This is followed with the letter A or B,
indicating respectively that the oligonucleotide is
complementary to the sense or anti-sense strand of the encoding
region. Secondary consensus oligonucleotides used for 
sequencing and labeling reactions have the letters SQ at the 
end of the designation. 
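
As an illustration of this naming convention, the following sketch splits a designation into its parts. The regular expression and the function name are assumptions made for this example; the designation "ENTFASQ" used in the usage note is hypothetical and serves only to show the SQ suffix.

```python
import re
from typing import NamedTuple


class OligoName(NamedTuple):
    motif: str             # single-letter amino acid code of the conserved region
    complementary_to: str  # "sense" (letter A) or "anti-sense" (letter B)
    secondary: bool        # True when the designation ends in SQ


# Amino acid motif, then A or B, then an optional SQ suffix.
_PATTERN = re.compile(r"^([ACDEFGHIKLMNPQRSTVWY]+)([AB])(SQ)?$")


def parse_oligo_name(name: str) -> OligoName:
    """Split an oligonucleotide designation into motif, strand, and SQ flag."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized designation: {name!r}")
    motif, strand, suffix = match.groups()
    strand_name = "sense" if strand == "A" else "anti-sense"
    return OligoName(motif, strand_name, suffix is not None)
```

For example, `parse_oligo_name("SHMDA")` yields the motif SHMD with complementarity to the sense strand, and the hypothetical designation "ENTFASQ" would be read as the motif ENTF, sense strand, secondary oligonucleotide.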

General techniques

The practice of the present invention will employ, unless 
otherwise indicated, conventional techniques of molecular 
biology, microbiology, recombinant DNA, and immunology, 
which are within the skill of the art. Such techniques are 
explained fully in the literature. See, for example,
"Molecular Cloning: A Laboratory Manual", Second Edition
(Sambrook, Fritsch & Maniatis, 1989), "Oligonucleotide
Synthesis" (M. J. Gait, ed., 1984), "Animal Cell Culture" (R.
I. Freshney, ed., 1987); the series "Methods in Enzymology"
(Academic Press, Inc.); "Handbook of Experimental
Immunology" (D. M. Weir & C. C. Blackwell, eds.), "Gene
Transfer Vectors for Mammalian Cells" (J. M. Miller & M.
P. Calos, eds., 1987), "Current Protocols in Molecular
Biology" (F. M. Ausubel et al., eds., 1987); and "Current
Protocols in Immunology" (J. E. Coligan et al., eds., 1991).

All patents, patent applications, articles and publications 
mentioned herein, both supra and infra, are hereby incorpo- 
rated herein by reference. 

Polynucleotides encoding Glycoprotein B of the herpes 
virus RFHV/KSHV subfamily

This invention embodies isolated polynucleotide seg- 
ments derived from Glycoprotein B genes present in herpes 
viruses that encode a fragment of a Glycoprotein B polypep- 
tide. The polynucleotides are related to the RFHV/KSHV 
subfamily of herpes viruses. Exemplary polynucleotides
encode Glycoprotein B fragments from either RFHV or 
KSHV. Preferred fragments include those shown in FIG. 1, 
and subfragments thereof, obtained as described in the 
Example section below. Especially preferred is the
polynucleotide comprising the sequence between residues
36-354 of SEQ. ID NO:1, SEQ. ID NO:3, or SEQ. ID
NO:96, or polynucleotides contained in SEQ. ID NO:92.

The polynucleotide segments of RFHV and KSHV 
between residues 36 and 354 are 76% identical. Shared 
residues are indicated in FIG. 1 by "*". The longest
subregions that are identically shared between RFHV and KSHV
within this segment are 15, 17, and 20 nucleotides in length. 
The 319 base pair fragments of RFHV and KSHV
between the amplification primer binding sites are more
identical to each other than either of them is to that of any
previously sequenced herpes virus. The next most closely
related sequences are sHV1 and bHV4, which are 63%
identical to the corresponding sequence of KSHV, and 60%
identical to the corresponding sequence of RFHV. The
greatest number of consecutive bases shared between the
Glycoprotein B fragment and any of the previously
sequenced viruses is 14. It is believed that any subfragment
of the RFHV or KSHV sequence of 16 base pairs or longer
will be unique to the RFHV/KSHV subfamily, or to
particular herpes virus species and variants within the
subfamily.
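
The number of consecutive bases shared between two sequences, as discussed above, can be determined with a routine of the following kind. This is a minimal sketch using a standard dynamic-programming approach; the function name is hypothetical, and the sequences in the usage note are toy strings rather than sequences from this disclosure.

```python
def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest run of consecutive bases present in both a and b."""
    best = 0
    # prev[j] holds the length of the shared run ending at a[i-2] and b[j-1].
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous base of each sequence.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best
```

For the toy sequences "ACGTACGTTT" and "GGACGTAA", the longest shared run is ACGTA, so the function returns 5. A candidate subfragment of 16 bases would be expected to be distinguishing with respect to a comparison sequence whenever this value is below 16.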

This invention embodies subfragments contained in the 
Glycoprotein B gene of the RFHV/KSHV subfamily,
preferably contained in the region corresponding to the 319 base
pair fragment between residues 36-354 of SEQ. ID NO:1,
SEQ. ID NO:3, or SEQ. ID NO:96, or anywhere in SEQ. ID
NO:92. Preferably, the subfragments are at least about 16
nucleotides in length; more preferably they are at least 18 
nucleotides in length; more preferably they are at least 21 
nucleotides in length; more preferably they are at least about 
25 nucleotides in length; more preferably they are at least 
about 35 nucleotides in length; still more preferably they are 
at least about 50 nucleotides in length; yet more preferably 
they are at least about 75 nucleotides in length, and even 
more preferably they are 100 nucleotides in length or more. 
Also embodied in this invention are polynucleotides com- 
prising the entire open reading frame of each respective 
herpes virus Glycoprotein B. 

The RFHV/KSHV subfamily consists of members that 
have sequences that are more closely identical to the corre- 
sponding sequences of RFHV or KSHV, than RFHV or 
KSHV are to any other virus listed in Table 1. Preferred 
members of the family may be identified on the basis of the 
sequence of the Glycoprotein B gene in the region corre- 
sponding to that of FIG. 1. Table 2 provides the degree of 
sequence identities in this region: 

TABLE 2

Sequence Identities Between Glycoprotein B of KSHV
and other Herpes Viruses

                                           Identity to polynucleotide fragment:
                                           RFHV              KSHV
                  Glycoprotein B  SEQ.     (SEQ. ID NO:1)    (SEQ. ID NO:3)
                  Sequence        ID NO:   Bases 36-354      Bases 36-354

RFHV/KSHV         RFHV            1        (100%)            76%
subfamily         KSHV            3        76%               (100%)

Other gamma       sHV1            5        60%               63%
herpes viruses    bHV4            6        60%               63%
                  eHV2            7        52%               54%
                  mHV68           8        56%               54%
                  hEBV            9        <50%              52%

alpha and beta    hCMV            10       <50%              <50%
herpes viruses    hHV6            11       <50%              <50%
                  hVZV            12       <50%              <50%
                  HSV1            13       <50%              <50%


The percentage of sequence identity is calculated by first 
aligning the encoded amino acid sequence, determining the 
corresponding alignment of the encoding polynucleotide, 
and then counting the number of residues shared between 
the sequences being compared at each aligned position. No 
penalty is imposed for the presence of insertions or 
deletions, but insertions or deletions are permitted only 
where required to accommodate an obviously increased 
number of amino acid residues in one of the sequences being
aligned. Offsetting insertions just to improve sequence 
alignment are not permitted at either the polypeptide or 
polynucleotide level. Thus, any insertions in the polynucle- 
otide sequence will have a length which is a multiple of 3. 
The percentage is given in terms of residues in the test 
sequence that are identical to residues in the comparison or 
reference sequence. 
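
The calculation described above can be sketched as follows for two polynucleotide sequences that have already been aligned according to the encoded amino acid sequence. The sketch rests on stated assumptions: gaps are written as "-", positions opposite a gap are skipped so that no gap penalty is imposed, and the percentage is taken over the test-sequence residues that are aligned with a reference residue. The function names are hypothetical.

```python
import re


def percent_identity(test: str, reference: str, gap: str = "-") -> float:
    """Percent of aligned test residues identical to the reference residue.

    Both inputs must be the same length (already aligned). Columns with a
    gap in either sequence are skipped, so gaps carry no penalty.
    """
    if len(test) != len(reference):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for t, r in zip(test, reference):
        if t == gap or r == gap:
            continue
        compared += 1
        if t == r:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / compared


def gaps_are_codon_sized(aligned: str, gap: str = "-") -> bool:
    """Check that every gap run has a length that is a multiple of 3."""
    runs = re.findall(re.escape(gap) + "+", aligned)
    return all(len(run) % 3 == 0 for run in runs)
```

With the toy alignment of "ATGA---CCGT" against "ATGCAAACCCT", eight positions are compared and six are identical, giving 75%. The second function reflects the requirement that insertions in the polynucleotide alignment span whole codons.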

Preferred Glycoprotein B encoding polynucleotide 
sequences of this invention are those derived from the 
RFHV/KSHV herpes virus subfamily. They include those 
sequences that are at least 65% identical with the RFHV or 
KSHV sequence between bases 36 and 354; more 
preferably, the sequences are at least 67% identical; more 
preferably, the sequences are at least about 70% identical; 
more preferably, the sequences are at least about 75% 
identical; more preferably, the sequences are at least about 
80% identical; more preferably, the sequences are at least 
about 85% identical; more preferably, the sequences are at 
least about 90% identical; even more preferably, the 
sequences are over 95% identical. Also included are Gly- 
coprotein B encoding regions that are upstream or down- 
stream of a region fulfilling the identity criteria indicated. 

Other preferred Glycoprotein B encoding polynucleotide 
sequences may be identified by the percent identity with 
RFHV/KSHV subfamily-specific oligonucleotides (Type 2
oligonucleotides) described in more detail in a later section. 
The percent identity of RFHV and KSHV Glycoprotein B 
with exemplary Type 2 oligonucleotides is shown in Table 3: 

TABLE 3

Sequence Identities Between Glycoprotein B of Select
Herpes Viruses and RFHV/KSHV Subfamily Specific Oligonucleotides

                          Identity to   Identity to   Identity to   Identity to
                          SHMDA         CFSSB         ENTFA         DNIQB
Glycoprotein   SEQ        (SEQ ID       (SEQ ID       (SEQ ID       (SEQ ID
B Sequence     ID NO:     NO:41)        NO:43)        NO:45)        NO:46)

RFHV           1          91%           91%           89%           91%
KSHV           3          100%          85%           89%           97%
sHV1           5          71%           70%           66%           66%
bHV4           6          57%           64%           69%           74%
eHV2           7          57%           61%           54%           60%
mHV68          8          <50%          55%           54%           77%
hEBV           9          57%           55%           60%           51%
hCMV           10         57%           55%           60%           51%
hHV6           11         <50%          52%           60%           57%
hVZV           12         54%           58%           66%           57%
HSV1           13         57%           60%           54%           54%


Percent identity is calculated for oligonucleotides of this 
length by not allowing gaps in either the oligonucleotide or 
the polypeptide for purposes of alignment. Throughout this 
disclosure, whenever at least one of two sequences being 
compared is a degenerate oligonucleotide comprising an 
ambiguous residue, the two sequences are identical if at least 
one of the alternative forms of the degenerate oligonucle- 
otide is identical to the sequence with which it is being 
compared. As an illustration, AYAAA is 100% identical to 
ATAAA, since AYAAA is a mixture of ATAAA and 
ACAAA. 
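
The treatment of degenerate oligonucleotides described above can be sketched as follows. The sketch assumes the standard IUPAC nucleotide ambiguity codes and an ungapped comparison of sequences of equal length; the function name is hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def percent_identity_degenerate(a: str, b: str) -> float:
    """Ungapped percent identity in which a position counts as identical
    if at least one alternative form of each residue is the same base."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if set(IUPAC[x]) & set(IUPAC[y]):
            matches += 1
    return 100.0 * matches / len(a)
```

Consistent with the illustration above, comparing AYAAA with ATAAA gives 100%, whereas comparing AYAAA with AGAAA gives 80%, since G is not among the alternative forms of Y.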

Preferred Glycoprotein B encoding sequences are those 
which over the corresponding region are at least 72% 
identical to SHMDA; more preferably they are at least 74% 
identical; more preferably they are at least about 77% 
identical; more preferably they are at least about 80% 
identical; more preferably they are at least about 85% 
identical; more preferably they are at least about 91% 
identical. Other preferred Glycoprotein B encoding
sequences are those which over the corresponding region are
at least 71% identical to CFSSB; more preferably they are at 
least 73% identical; more preferably they are at least about 
77% identical; more preferably they are at least about 80% 
identical; more preferably they are at least about 85% 
identical. Other preferred Glycoprotein B encoding 
sequences are those which over the corresponding region are 
at least 70% identical to ENTFA; more preferably they are 
at least 72% identical; more preferably they are at least about 
75% identical; more preferably they are at least about 80% 
identical; more preferably they are at least about 85% 
identical; even more preferably, they are at least about 89% 
identical. Other preferred Glycoprotein B encoding 
sequences are those which over the corresponding region are 
at least about 78% identical to DNIQB; more preferably they 
are at least 80% identical; more preferably they are at least 
about 85% identical; more preferably they are at least about 
91% identical. Also included are Glycoprotein B encoding 
regions that are upstream or downstream of a region fulfill- 
ing the identity criteria indicated. 

Glycoprotein B encoding sequences from members of the 
RFHV/KSHV subfamily identified by any of the aforemen- 
tioned sequence comparisons, using either RFHV or KSHV 
sequences, or the subfamily-specific oligonucleotides, are 
equally preferred. Exemplary sequences are the Glycopro- 
tein B encoding sequences of RFHV and KSHV. Also 
embodied in this invention are fragments of any Glycopro- 
tein B encoding sequences of the subfamily, and longer 
polynucleotides comprising such polynucleotide fragments. 
The polynucleotide sequences described in this section
provide a basis for obtaining the synthetic oligonucleotides,
proteins and antibodies outlined in the sections that follow.
These compounds may be prepared by standard techniques 
known to a practitioner of ordinary skill in the art, and may 
be used for a number of investigative, diagnostic, and 
therapeutic purposes, as described below. 
Preparation of polynucleotides 

Polynucleotides and oligonucleotides of this invention 
may be prepared by any suitable method known in the art. 
For example, oligonucleotide primers can be used in a PCR 
amplification of DNA obtained from herpes virus infected 
tissue, as in Example 3 and Example 11, described below. 
Alternatively, oligonucleotides can be used to identify suit- 
able bacterial clones of a DNA library, as described below in 
Example 8. 

Polynucleotides may also be prepared directly from the 
sequence provided herein by chemical synthesis. Several 
methods of synthesis are known in the art, including the 
triester method and the phosphite method. In a preferred 
method, polynucleotides are prepared by solid-phase synthesis
using mononucleoside phosphoramidite coupling
units. See, for example, Horise et al., Beaucage et al., Kumar
et al., and U.S. Pat. No. 4,415,732. 

A typical solid-phase synthesis involves reiterating four
steps: deprotection, coupling, capping, and oxidation. This
results in the stepwise synthesis of an oligonucleotide in the
3' to 5' direction.

In the first step, the growing oligonucleotide, which is
attached at the 3'-end via a (-O-) group to a solid support,
is deprotected at the 5' end. For example, the 5' end may be
protected by a -ODMT group, formed by reacting with
4,4'-dimethoxytrityl chloride (DMT-Cl) in pyridine. This
group is stable under basic conditions, but is easily removed
under acid conditions, for example, in the presence of
dichloroacetic acid (DCA) or trichloroacetic acid (TCA).
Deprotection provides a 5'-OH reactive group.

In the second step, the oligonucleotide is reacted with the
desired nucleotide monomer, which itself has first been
converted to a 5'-protected, 3'-phosphoramidite. The 5'-OH
of the monomer may be protected, for example, in the form
of a -ODMT group, and the 3'-OH group may be converted
to a phosphoramidite, such as -OP(OR')NR2; where R is
the isopropyl group -CH(CH3)2; and R' is, for example,
-H (yielding a phosphoramidite diester), or -CH3,
-CH2CH3, or the beta-cyanoethyl group -CH2CH2CN
(yielding a phosphoramidite triester). The
3'-phosphoramidite group of the monomer reacts with the
5'-OH group of the growing oligonucleotide to yield the
phosphite linkage 5'-OP(OR')O-3'.

In the third step, oligonucleotides that have not coupled 
with the monomer are withdrawn from further synthesis to 
prevent the formation of incomplete polymers. This is 
achieved by capping the remaining 5'-OH groups, for
example, in the form of acetates (-OC(O)CH3) by reaction
with acetic anhydride (CH3C(O)-O-C(O)CH3).

In the fourth step, the newly formed phosphite group (i.e.,
5'-OP(OR')O-3') is oxidized to a phosphate group (i.e.,
5'-OP(=O)(OR')O-3'); for example, by reaction with aqueous
iodine and pyridine.

The four-step process may then be reiterated, since the
oligonucleotide obtained at the end of the process is
5'-protected and is ready for use in step one. When the
desired full-length oligonucleotide has been obtained, it may
be cleaved from the solid support, for example, by treatment
with alkali and heat. This step may also serve to convert
phosphate triesters (i.e., when R' is not -H) to the phosphate
diesters (-OP(=O)2O-), and to deprotect base-labile
protected amino groups of the nucleotide bases.
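
Because the chain grows in the 3' to 5' direction, monomers are coupled in the reverse of the order in which the sequence is conventionally written. The sketch below illustrates that ordering, together with the usual estimate that the fraction of full-length product equals the per-cycle coupling efficiency raised to the number of couplings (capped failures do not extend further). The 99% efficiency is an assumed illustrative value, not a figure from this disclosure; the 23-mer is NIVPASQ from Table 4.

```python
def coupling_order(sequence_5_to_3: str) -> list[str]:
    """Order in which nucleosides enter the chain.

    The first entry is the 3'-terminal nucleoside already attached to the
    solid support; each later entry is added by one four-step cycle.
    """
    return list(reversed(sequence_5_to_3.upper()))


def full_length_fraction(length: int, coupling_efficiency: float) -> float:
    """Estimated fraction of chains that reach full length.

    A chain of `length` bases needs (length - 1) couplings, because the
    first nucleoside is pre-attached to the support.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    if not 0.0 <= coupling_efficiency <= 1.0:
        raise ValueError("coupling_efficiency must be between 0 and 1")
    return coupling_efficiency ** (length - 1)


oligo = "GTGTACAAGAAGAACATCGTGCC"  # NIVPASQ, written 5' to 3'
order = coupling_order(oligo)
print(order[0], order[-1])                         # support-bound 3' base, then final 5' base
print(round(full_length_fraction(len(oligo), 0.99), 3))  # assumed 99% efficiency per cycle
```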

Polynucleotides prepared by any of these methods can be 
replicated to provide a larger supply by any standard 
technique, such as PCR amplification or gene cloning. 
Cloning and expression vectors comprising a Glycoprotein
B encoding polynucleotide

Cloning vectors and expression vectors are provided in 
this invention that comprise a sequence encoding a herpes 
virus Glycoprotein B or variant or fragment thereof. Suitable
cloning vectors may be constructed according to standard
techniques, or may be selected from the large number of
cloning vectors available in the art. While the cloning vector
selected may vary according to the host cell intended to be
used, useful cloning vectors will generally have the ability
to self-replicate, may possess a single target for a particular
restriction endonuclease, and may carry genes for a marker
that can be used in selecting transfected clones. Suitable
examples include plasmids and bacterial viruses; e.g.,
pUC18, mp18, mp19, pBR322, pMB9, ColE1, pCR1, RP4,
phage DNAs, and shuttle vectors like pSA3 and pAT28.

Expression vectors generally are replicable polynucleotide
constructs that encode a polypeptide operatively linked
to suitable transcriptional and translational controlling elements.
Examples of transcriptional controlling elements are
promoters, enhancers, transcription initiation sites, and transcription
termination sites. Examples of translational controlling
elements are ribosome binding sites, translation
initiation sites, and stop codons. Protein processing elements
may also be included: for example, regions that encode
leader or signal peptides and protease cleavage sites required
for translocation of the polypeptide across the membrane or
secretion from the cell. The elements employed would be
functional in the host cell used for expression. The controlling
elements may be derived from the same Glycoprotein B
gene used in the vector, or they may be heterologous (i.e.,
derived from other genes and/or other organisms).

Polynucleotides may be inserted into host cells by any 
means known in the art. Suitable host cells include bacterial 
cells such as E. coli, mycobacteria, other prokaryotic microorganisms
and eukaryotic cells (including fungal cells,
insect cells, plant cells, and animal cells). The cells are
transformed by inserting the exogenous polynucleotide by
direct uptake, endocytosis, transfection, f-mating, or electroporation.
Subsequently, the exogenous polynucleotide
may be maintained within the cell as a non-integrated vector, 
such as a plasmid, or may alternatively be integrated into the 
host cell genome. 

Cloning vectors may be used to obtain replicate copies of 
the polynucleotides they contain, or as a means of storing the 
polynucleotides in a depository for future recovery. Expression
vectors and host cells may be used to obtain polypeptides
encoded by the polynucleotides they contain. They
may also be used in assays where it is desirable to have intact 
cells capable of synthesizing the polypeptide, such as in a 
drug screening assay. 

Synthetic Type 1 oligonucleotides for Glycoprotein B of 
gamma herpes virus 

Oligonucleotides designed from sequences of herpes 
virus Glycoprotein B, as embodied in this invention, can be 
used as probes to identify related sequences, or as primers in 
an amplification reaction such as a PCR. 

Different oligonucleotides with different properties are 
described in the sections that follow. Oligonucleotides des- 
ignated as Type 1 are designed from previously known 
gamma herpes virus Glycoprotein B polynucleotide 
sequences. They are designed to hybridize with polynucle- 
otides encoding any gamma herpes virus Glycoprotein B, 
and may be used to detect previously known species of 
gamma herpes virus. They may also be used to detect and 
characterize new species of gamma herpes virus. Oligo- 
nucleotides designated as Type 2 are designed from the 
RFHV and KSHV Glycoprotein B polynucleotide sequences 
together. They are designed to hybridize with polynucle- 
otides encoding Glycoprotein B of the RFHV/KSHV 
subfamily, including but not limited to RFHV and KSHV. 
Oligonucleotides designated as Type 3 are designed from 
RFHV or KSHV Glycoprotein sequences that are relatively 
unique to the individual virus. They are designed to hybrid- 
ize specifically with polynucleotides encoding Glycoprotein 
B only from RFHV or KSHV and closely related viral 
strains. 

Some preferred examples of Type 1 oligonucleotides are 
listed in Table 4. These oligonucleotides have a specificity 
for Glycoprotein B encoding polynucleotides of a broad 
range of herpes viruses.

TABLE 4

Type 1 Oligonucleotides used for Detecting, Amplifying, or Characterizing Herpes
Virus Polynucleotides encoding Glycoprotein B
Target: Herpes Glycoprotein B, especially from gamma Herpes Viruses

Designation  Sequence (5' to 3')                     Length  No. of forms  Orientation  SEQ ID:

FRFDA        GCTGTTCAGATTTGACTTAGAYMANMCNTGYCC       33      256           sense        13
NIVPA        GTGTACAAGAAGAACATCGTGCCNTAYATNTTYAA     32      64            sense        14
NIVPASQ      GTGTACAAGAAGAACATCGTGCC                 23      1                          15
TVNCB        AACATGTCTACAATCTCACARTTNACNGTNGT        32      128           anti-sense   16
TVNCBSQ      AACATGTCTACAATCTCACA                    20      1                          17
FAYDA        AATAACCTCTTTACGGCCCAAATTCARTWYGCNTAYGA  38      64            sense        18
IYGKA        CCAACGAGTGTGATGTCAGCCATTTAYGGNAARCCNGT  38      64            sense        19
IYGKASQ      CCAACGAGTGTGATGTCAGCC                   21      1                          20
CYSRA        TGCTACTCGCGACCTCTAGTCACCTTYAARTTYRTNAA  38      64            sense        21
CYSRASQ      TGCTACTCGCGACCTCTAGTCACC                24      1                          22
NIDFB        ACCGGAGTACAGTTCCACTGTYTTRAARTCDATRTT    36      48            anti-sense   23
NIDFBSQ      TGTCACCTTGACATGAGGCCA                   21      1                          24
FREYA        TTTGACCTGGAGACTATGTTYMGNGARTAYAA        32      64            sense        25
FREYB        GCTCTGGGTGTAGTAGTTRTAYTCYCTRAACAT       33      16            anti-sense   26
NVFDB        TCTCGGAACATGCTCTCCAGRTCRAAMACRTT        32      32            anti-sense   27
GGMA         ACCTTCATCAAAAATCCCTTNGGNGGNATGYT        32      128           sense        28
TVNCA        TGGACTTACAGGACTCGAACNACNGTNAAYTG        32      128           sense        29


The orientation indicated in Table 4 is relative to the 
encoding region of the polynucleotide. Oligomers with a 
"sense" orientation will hybridize to the strand antisense to 
the coding strand and initiate amplification in the direction 
of the coding sequence. Oligomers with an "antisense" 
orientation will hybridize to the coding strand and initiate 
amplification in the direction opposite to the coding 
sequence. 
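
To compare an "antisense" oligomer with the coding strand, it is convenient to take its reverse complement. The following is a minimal sketch of that conversion; the helper name is illustrative, and the complement table follows the standard single-letter ambiguity convention. The example input is TVNCBSQ from Table 4.

```python
# Each base (or ambiguity code) maps to the code for its complement.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


tvncbsq = "AACATGTCTACAATCTCACA"          # anti-sense orientation (Table 4)
coding_strand_site = reverse_complement(tvncbsq)
print(coding_strand_site)                 # TGTGAGATTGTAGACATGTT
```

The printed sequence is the stretch of the coding strand with which an antisense oligomer of this sequence would be expected to pair.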

These oligonucleotides have been designed with several 
properties in mind: 1) sensitivity for target DNA even when 
present in the source material at very low copy numbers; 2) 
sufficient specificity to avoid hybridizing with unwanted 
sequences; for example, host sequences with limited similarity;
3) sufficient cross-reactivity so that differences
between an unknown target and the sequence used to design
the oligonucleotide do not prevent it from forming a stable
duplex with the target.

For some applications, a particularly effective design is 
oligonucleotides that have a degenerate segment at the 3' 
end, designed from a region of at least 2 known polynucle- 
otides believed to be somewhat conserved with the poly- 
nucleotide target. The various permutations of the ambigu- 
ous residues help ensure that at least one of the alternative 
forms of the oligonucleotide will be able to hybridize with 
the target. Adjacent to the degenerate segment at the 5' end 
of the oligonucleotide is a consensus segment which 
strengthens any duplex which may form and permits hybrid- 
ization or amplification reactions to be done at higher 
temperatures. The degenerate segment is located at the 3' 
end of the molecule to increase the likelihood of a close 
match between the oligonucleotide and the target at the site 
where elongation begins during a polymerase chain reaction. 
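
The consensus-degenerate layout described above can be illustrated by splitting an oligonucleotide at its first ambiguous position. This is only a sketch under a simplifying assumption: the boundary is taken to be the first ambiguity code, which may not be how the segments were delimited in every case. For TVNCB, the 5' portion obtained this way coincides with TVNCBSQ as listed in Table 4.

```python
AMBIGUITY_CODES = set("RYWSMKBDHVN")


def split_consensus_degenerate(oligo: str) -> tuple[str, str]:
    """Split an oligo into (5' consensus segment, 3' degenerate segment).

    Assumption: the degenerate segment starts at the first ambiguous base.
    """
    oligo = oligo.upper()
    for index, base in enumerate(oligo):
        if base in AMBIGUITY_CODES:
            return oligo[:index], oligo[index:]
    return oligo, ""  # no ambiguous positions at all


tvncb = "AACATGTCTACAATCTCACARTTNACNGTNGT"
consensus, degenerate = split_consensus_degenerate(tvncb)
print(consensus)    # AACATGTCTACAATCTCACA
print(degenerate)   # RTTNACNGTNGT
```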

The ambiguous residues in the degenerate part of the 
sequences are indicated according to the following code: 



TABLE 5

Single Letter Codes for Ambiguous Positions

Code  Represents

R     A or G (purine)
Y     C or T (pyrimidine)
W     A or T
S     C or G
M     A or C
K     G or T
B     C or G or T (not A)
D     A or G or T (not C)
H     A or C or T (not G)
V     A or C or G (not T)
N     A or C or G or T
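
The "No. of forms" column of the oligonucleotide tables follows from these codes: it is the product of the number of alternatives at each position. The sketch below computes that product and, for short oligonucleotides, lists the alternative forms. The function names are illustrative only.

```python
from itertools import product
from math import prod

# Single-letter codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG", "M": "AC", "K": "GT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_forms(oligo: str) -> int:
    """Number of distinct sequences represented by a degenerate oligo."""
    return prod(len(IUPAC[base]) for base in oligo.upper())


def expand_forms(oligo: str) -> list[str]:
    """List every sequence represented (practical only for low degeneracy)."""
    choices = (IUPAC[base] for base in oligo.upper())
    return ["".join(combo) for combo in product(*choices)]


print(count_forms("GCTGTTCAGATTTGACTTAGAYMANMCNTGYCC"))  # FRFDA: 256
print(count_forms("AACATGTCTACAATCTCACARTTNACNGTNGT"))   # TVNCB: 128
print(expand_forms("CAYAT"))                             # ['CACAT', 'CATAT']
```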



The Type 1 oligonucleotides shown in Table 4 are generally
useful for hybridizing with Glycoprotein B encoding
polynucleotide segments. This may be conducted to detect
the presence of the polynucleotide, or to prime an amplification
reaction so that the polynucleotide can be characterized
further. Suitable targets include polynucleotides encoding
a region of a Glycoprotein B from a wide spectrum of
gamma herpes viruses, including members of the RFHV/
KSHV subfamily. Suitable viruses are those infecting any vertebrate
animal, including humans and non-human primates, whether
or not the Glycoprotein B or the virus has been previously
known or described. Non-limiting examples include polynucleotides
encoding Glycoprotein B from any of the
gamma herpes viruses listed in Table 1.

The oligonucleotides may be used, inter alia, to prime a
reaction to amplify a region of the target polynucleotide in
the 3' direction from the site where the oligonucleotide
hybridizes. FRFDA, NIVPA, TVNCB, FAYDA, IYGKA,
CYSRA, NIDFB, FREYA, FREYB, NVFDB, GGMA, and
TVNCA are oligonucleotides with a consensus segment
adjoining a degenerate segment, and are useful for this
purpose.

FIG. 2 shows the position along the Glycoprotein B
polynucleotide sequence of the RFHV/KSHV subfamily
where the aforementioned oligonucleotide primers are
expected to hybridize. The map is not drawn to scale, but
accurately depicts the order of the predicted hybridization
sites in the 5' to 3' direction along the Glycoprotein B
encoding strand. Also indicated are approximate lengths of
amplification products that may be generated by using
various sets of primers in an amplification reaction. The
lengths shown include the primer binding sites at each end,
and the polynucleotide encompassed between them.

A preferred source of DNA for use as a target for the
oligonucleotides of Table 4 is any biological sample
(including solid tissue and tissue cultures), particularly of
vertebrate animal origin, known or suspected to harbor a
herpes virus. DNA is extracted from the source by any
method known in the art, including extraction with organic
solvents or precipitation at high salt concentration.

A preferred method of amplification is a polymerase chain
reaction: see generally U.S. Pat. No. 4,683,195 (Mullis) and
U.S. Pat. No. 4,683,202 (Mullis et al.); see U.S. Pat. No.
5,176,995 (Sninsky et al.) for application to viral polynucleotides.
An amplification reaction may be conducted by
combining the target polynucleotide to be amplified with
short oligonucleotides capable of hybridizing with the target
and acting as a primer for the polymerization reaction. Also
added are substrate mononucleotides and a heat-stable
DNA-dependent DNA polymerase, such as Taq. The conditions
used for amplification reactions are generally known in
the art, and can be optimized empirically using sources of
known viruses, such as RFHV, KSHV, hEBV or HSV1. Conditions
can be altered, for example, by changing the time and
temperature of the amplification cycle, particularly the
hybridization phase; changing the molarity of the oligonucleotide
primers; changing the buffer composition; and
changing the number of amplification cycles. Fine-tuning
the amplification conditions is a routine matter for a practitioner
of ordinary skill in the art.

In one method, a single primer of this invention is used in
the amplification, optionally using a second primer, such as
a random primer, to initiate replication downstream from the
first primer and in the opposite direction. In a preferred
method, at least two of the primers of this invention are used
in the same reaction to initiate replication in opposite
directions. The use of at least two specific primers enhances
the specificity of the amplification reaction, and defines the
size of the fragment for comparison between samples. For
example, amplification may be performed using primers
NIVPA and TVNCB. More preferred is the use of several
sets of primers in a nested fashion to enhance the amplification.
Nesting is accomplished by performing a first amplification
using primers that generate an intermediate product,
comprising one or more internal binding sites for additional
primers. This is followed by a second amplification, using a
new primer in conjunction with one from the previous set,
or two new primers. The second amplification product is
therefore a subfragment of the first product. If desired,
additional rounds of amplification can be performed using
additional primers.

Accordingly, nesting amplification reactions can be performed
with any combination of three or more oligonucleotide
primers comprising at least one primer with a sense
orientation and one primer with an antisense orientation.
Preferably, primers are chosen so that intermediate amplification
products are no more than about 2000 base pairs;
more preferably, they are no more than about 1500 base
pairs; even more preferably, they are no more than about 750
base pairs. Preferably, the innermost primers provide a final
amplification product of no more than about 1200 base pairs;
more preferably, it is no more than about 750 base pairs;
even more preferably, it is no more than about 500 base
pairs. Accordingly, a preferred combination is at least three
primers selected from FAYDA, IYGKA, CYSRA, NIDFB,
NVFDB, and FREYB. Another preferred combination is at
least three primers selected from FRFDA, NIVPA, TVNCA,
NIDFB, NVFDB, and FREYB.
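
The nesting and size preferences above can be checked for a proposed primer set once binding-site coordinates are known. The sketch below is illustrative only: the `PrimerSite` type, the primer names, and all coordinates are hypothetical placeholders, not positions of any primer of this disclosure. The limits of 2000 and 1200 base pairs are the broadest preferred values stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimerSite:
    name: str
    start: int   # 0-based start of the binding site on the coding strand
    end: int     # exclusive end of the binding site
    sense: bool  # True for sense orientation, False for antisense


def amplicon_length(forward: PrimerSite, reverse: PrimerSite) -> int:
    """Product length, including both primer binding sites."""
    if not forward.sense or reverse.sense:
        raise ValueError("need one sense and one antisense primer")
    if reverse.end <= forward.start:
        raise ValueError("antisense primer must bind downstream of the sense primer")
    return reverse.end - forward.start


def is_nested(outer: tuple[PrimerSite, PrimerSite],
              inner: tuple[PrimerSite, PrimerSite]) -> bool:
    """True if the inner product lies within the outer product."""
    return outer[0].start <= inner[0].start and inner[1].end <= outer[1].end


# Hypothetical coordinates, for illustration only.
outer_sense = PrimerSite("outer_sense", 0, 30, True)
inner_sense = PrimerSite("inner_sense", 200, 230, True)
shared_antisense = PrimerSite("shared_antisense", 670, 700, False)

first_round = amplicon_length(outer_sense, shared_antisense)    # 700
second_round = amplicon_length(inner_sense, shared_antisense)   # 500
print(first_round <= 2000, second_round <= 1200)
print(is_nested((outer_sense, shared_antisense), (inner_sense, shared_antisense)))
```

Reusing one antisense primer in both rounds mirrors the option, described above, of using a new primer in conjunction with one from the previous set.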

Particularly preferred is a first amplification using primers
FRFDA and TVNCB, followed by a second amplification
using primers NIVPA and TVNCB. When performed on a
polynucleotide from a Glycoprotein B gene of KSHV, the 
size of the final fragment including the primer binding 
regions is about 386 bases. 

The amplified polynucleotides can be characterized at any 
stage during the amplification reaction, for example, by size 
determination. Preferably, this is performed by running the 
polynucleotide on a gel of about 1-2% agarose. If present in 
sufficient quantity, the polynucleotide in the gel can be 
stained with ethidium bromide and detected under ultravio- 
let light. Alternatively, the polynucleotide can be labeled 
with a radioisotope such as 32P or 35S before loading on a gel
of about 6% polyacrylamide, and the gel can subsequently
be used to produce an autoradiogram. A preferred method of
labeling the amplified polynucleotide is to end-label an
oligonucleotide primer such as NIVPA with 32P using a
polynucleotide kinase and gamma-[32P]-ATP, and to continue
amplification for about 5 to 15 cycles.

If desired, size separation may also be used as a step in the 
preparation of the amplified polynucleotide. This is particu- 
larly useful when the amplification mixture is found to 
contain artifact polynucleotides of different size, such as 
may have arisen through cross-reactivity with undesired 
targets. A separating gel, such as described in the preceding 
paragraph, is dried onto a paper backing and used to produce 
an autoradiogram. Positions of the gel corresponding to the 
desired bands on the autoradiogram are cut out and extracted 
by standard techniques. The extracted polynucleotide can 
then be characterized directly, cloned, or used for a further 
round of amplification. 

The oligonucleotides NIVPASQ, TVNCBSQ, 
IYGKASQ, CYSRASQ, and NIDFBSQ are each derived 
from a consensus-degenerate Type 1 oligonucleotide. They 
retain the consensus segment, but lack the degenerate seg- 
ment. They are useful, inter alia, in sequencing of a Glyco- 
protein B polynucleotide fragment successfully amplified 
using a consensus-degenerate oligonucleotide. 

Unwanted polynucleotides in a mixture from an amplification
reaction can also be proportionally reduced by shifting
to primers of this type. For example, an initial 3 to 5 cycles
of amplification can be conducted using primers NIVPA and 
TVNCB at Vs to Vis the normal amount. Then a molar excess 
(for example, 50 pmol) of NIVPASQ and/or TVNCBSQ is 
added, and the amplification is continued for an additional 
30 to 35 cycles. This reduces the complexity of the oligonucleotides
present in the amplification mixture, and permits the
reaction temperatures to be increased to reduce amplification 
of unwanted polynucleotides. 

Type 2 oligonucleotide primers for Glycoprotein B of the 
RFHV/KSHV subfamily 

Type 2 oligonucleotides are intended for detection or 
amplification reactions for the Glycoprotein B of any virus 
of the RFHV/KSHV subfamily. They are designed from
segments of the Glycoprotein B encoding region that are
relatively well conserved between RFHV and KSHV, but not 
other previously sequenced gamma herpes viruses. Preferred 
examples are shown in Table 6: 
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optionally preamplified using Type 1 oligonucleotides such 
as NIVPA and TVNCB. Other combinations are also suit- 
able. In another example, one of the Type 2 oligonucleotides 
of Table 6 is used in combination with a suitable Type 1 



TABLE 6 



Type 2 Oligonucleotides used for Detectings Amplifying, or Characterizing Herpes 
Virus Polynucleotides encoding Glycoprotein B 
Target: Glycoprotein B from the RFHV/KSHV subfamily of herpes viruses 



Desig- 


Sequence 




No. of 


Orien- 


SEQ 


nation 


(5' to 3') 


Length 


forms 


tation 


ID: 


SHMDA 


AGACCCGTGCCACTCTATGARATHAGYCAYAT 


35 


24 


sense 


41 




GGA 










SHMDASQ 


AGACCCGTGCCACTCTATGA 


20 


1 




42 


CFSSB 


GTTCACAACAATCTTCATNGARCTRAARCA 


30 


32 


anti- 


43 










sense 




CFSSBSQ 


GTTCACAACAATCTTCAT 


IS 


1 




44 


ENTFA 


GTCAACGGAGTAGARAAYACNTTYACNGA 


29 


128 


sense 


45 


DNIQB 


ACTGGCTGGCTAAAGTACCTTTGAATRTTRTC 


35 


16 


anti- 


46 




NGT 






sense 




DNIQBSQ 


ACTGGCTGGCTAAAGTACCTTTG 


23 


1 




47 



Type 2 oligonucleotides may be used for many purposes 
where specificity for the RFHV/KSHV subfamily specificity 25 
is desired. This includes the detection or amplification of 
Glycoprotein B from known viruses of the RFHV/KSHV 
subfamily, or characterization of Glycoprotein B from new 
members of the family. 

SHMDA, CFSSB, ENTFA, and DNIQB are consensus- 30 
degenerate oligonucleotides with a degenerate 3' end, useful 
as initial primers for PCR amplifications, including poly- 
nucleotides of the RFHV/KSHV subfamily which are not 
identical to either RFHV or KSHV. SHMDASQ, CFSSBSQ, 
and DNIQBSQ contain only a consensus segment, and are 35 
useful for example in labeling or sequencing polynucle- 
otides already amplified using the consensus-degenerate 
oligonucleotides. 

In one application, these Type 2 oligonucleotides are used 
individually or in combination as amplification primers. In 
one example of this application, the oligonucleotides are 40 
used directly on DNA obtained from a tissue sample to 
obtain a Glycoprotein B from the RFHV/KSHV subfamily, 
but not more distantly related viruses that may be present in 
the same tissue, such as hEBV, hCMV or HSV1. Thus, 
SHMDA and DNIQB may be used as primers in a PCR, 



oligonucleotide listed earlier. Thus, NIVPA may be used in 
combination with DNIQB, or SHMDA may be used in 
combination with TVNCB as primers in a PCR. The DNA 
source may optionally be preamplified using NIVPA and 
TVNCB. Other combinations are also suitable. 

In another application, Type 2 oligonucleotides, or oligo- 
nucleotides comprising these sequences or fragments 
thereof, are used as probes in a detection assay. For example, 
they can be provided with a suitable label such as 32P, and
then used in a hybridization assay with a suitable target, such 
as DNA amplified using FRFDA and/or NIVPA, along with 
TVNCB. 

Type 3 oligonucleotide primers specific for Glycoprotein B 
of RFHV or KSHV 

Type 3 oligonucleotides are intended for detection or 
amplification reactions specific for a particular virus. They 
are non-degenerate segments of the Glycoprotein B encod- 
ing region of RFHV or KSHV that are relatively more 
variable between these two viruses and against other herpes 
viruses than are other segments of the region. Preferred 
examples are shown in Table 7, and in the Example section. 



TABLE 7

Type 3 Oligonucleotides used for Detecting, Amplifying, or Characterizing
Herpes Virus Polynucleotides encoding Glycoprotein B

Designation  Sequence (5' to 3')     Length  No. of forms  Orientation  SEQ ID:

Target: Glycoprotein B from RFHV

GMTEB        TGCTGCTTCTGTCATACCGCG   21      1             anti-sense   48
AAITB        TATTTGTTTGTGATTGCTGCT   21      1             anti-sense   49
GMTEA        GCGGTATGACAGAAGCAGCAA   21      1             sense        50
KYEIA        AACAAATATGAGATCCCCAGG   21      1             sense        51
TDRDB        TCATCCCGATCGGTGAACGTA   21      1             anti-sense   52
VEGLB        TTGTCAGTTAGACCTTCGACG   21      1             anti-sense   53
VEGLA        CCCGTCGAAGGTCTAACTGAC   21      1             sense        54
PVLYA        AGCCAACCAGTACTGTACTCT   21      1             sense        55

Target: Glycoprotein B from KSHV

GLTEB        TGATGGCGGACTCTGTCAAGC   21      1             anti-sense   56
TNKYB        GTTCATACTTGTTGGTGATGG   21      1             anti-sense   57
GLTEA        GGGCTTGACAGAGTCCGCCAT   21      1             sense        58
YELPA        ACAAGTATGAACTCCCGAGAC   21      1             sense        59
VNVNB        ACCCCGTTGACATTTACCTTC   21      1             anti-sense   60
TFTDV        TCGTCTCTGTCAGTAAATGTG   21      1             anti-sense   61
TVFLA        CCACAGTATTCCTCCAACCAG   21      1             sense        62
SQPVA        GGTACTTTAGCCAGCCGGTCA   21      1             sense        63



GMTEB, AAITB, GMTEA, KYEIA, TDRDB, VEGLB,
VEGLA, and PVLYA are specific non-degenerate oligonucleotides
for the RFHV Glycoprotein B, and can be used
for the amplification or detection of Glycoprotein B encoding
polynucleotides of RFHV origin. Amplification is preferably
done using the oligonucleotides in a nested fashion:
e.g., a first amplification is conducted using GMTEA and
VEGLB as primers; then a second amplification is conducted
using KYEIA and TDRDB as primers. This provides
an extremely sensitive amplification assay that is specific for
RFHV Glycoprotein B. GMTEB and AAITB hybridize near
the 5' end of the fragment, and may be used in combination
with up-stream hybridizing Type 1 oligonucleotides to
amplify or detect sequences in the 5' direction. VEGLA and
PVLYA hybridize near the 3' end of the fragment, and may
be used in combination with down-stream hybridizing Type
1 oligonucleotides to amplify or detect sequences in the 3'
direction.

Similarly, GLTEB, TNKYB, GLTEA, YELPA, VNVNB,
ENTFB, SQPVA, and TVFLA are specific non-degenerate
oligonucleotides for the KSHV Glycoprotein B, and can be
used in a similar fashion, including as primers for an
amplification reaction. Preferably, the amplification is done
using the oligonucleotides in a nested fashion: e.g., a first
amplification is conducted using GLTEA and ENTFB as
primers; then a second amplification is conducted using
YELPA and VNVNB as primers. This provides an extremely
sensitive amplification assay that is specific for KSHV
Glycoprotein B. GLTEB and TNKYB hybridize near the 5'
end of the fragment, and may be used in combination with
up-stream hybridizing Type 1 oligonucleotides to amplify or
detect sequences in the 5' direction. SQPVA and TVFLA
hybridize near the 3' end of the fragment, and may be used
in combination with down-stream hybridizing Type 1 oligonucleotides
to amplify or detect sequences in the 3'
direction.
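
Several of the Type 3 oligonucleotides in Table 7 are listed as sense and anti-sense pairs with related names (for example VEGLA and VEGLB, or GLTEA and GLTEB). As an observation drawn from the listed sequences rather than a statement made in the text, each such pair appears to cover largely the same site on opposite strands. The sketch below checks this by finding the longest run shared between the sense oligonucleotide and the reverse complement of its anti-sense partner; the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_shared_run(a: str, b: str) -> str:
    """Longest contiguous stretch present in both sequences (dynamic programming)."""
    best = ""
    previous = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                current[j] = previous[j - 1] + 1
                if current[j] > len(best):
                    best = a[i - current[j]:i]
        previous = current
    return best


pairs = {
    "VEGLA/VEGLB": ("CCCGTCGAAGGTCTAACTGAC", "TTGTCAGTTAGACCTTCGACG"),
    "GLTEA/GLTEB": ("GGGCTTGACAGAGTCCGCCAT", "TGATGGCGGACTCTGTCAAGC"),
}
shared = {
    name: longest_shared_run(sense, reverse_complement(antisense))
    for name, (sense, antisense) in pairs.items()
}
for name, run in shared.items():
    print(name, len(run), run)
```

Because the members of such a pair are largely complementary to each other, they would generally be chosen as alternatives for extending in opposite directions from the same site, not combined as the two primers of one reaction.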

Practitioners skilled in the art will immediately recognize
that oligonucleotides of Types 1, 2 and 3 (in particular, those
shown in Tables 4, 6 and 7) can be used in combination with
each other in a PCR to amplify different sections of a
Glycoprotein B encoding polynucleotide. The specificity of
the amplification reaction generally is determined by the
primer with the least amount of cross reactivity. The size and
location of the amplified fragment are determined by the
primers used in the final round of amplification. For
example, NIVPA used in combination with SQPVB will
amplify about 310 bases of Glycoprotein B encoding polynucleotide
from a virus closely related to KSHV. Suitable
combinations of oligonucleotides may be used as amplification
primers in a nested fashion.

Use of synthetic oligonucleotides to characterize polynucleotide
targets

As described in the previous section, the oligonucleotides
embodied in this invention can be used as primers for
amplification of polynucleotides encoding a herpes virus
Glycoprotein B, particularly in a polymerase chain reaction.

The conditions for conducting the PCR depend on the 
nature of the oligonucleotide being used. In particular, when 
using oligonucleotides comprising a degenerate segment, or 
a consensus segment that is only partly identical to the 
corresponding segment of the target, and when the target 
polynucleotide comprises an unknown sequence, the selec- 
tion of conditions may be important to the success of the 
amplification. Optimizing conditions for a new primer or
new polynucleotide target is routine for a practitioner of
ordinary skill. What follows is a guide to assist in that 
objective. 

First, the temperature of the annealing step of the PCR is 
optimized to increase the amount of target polynucleotide 
being amplified above the amount of unrelated polynucle- 
otide amplified. Ideally, the temperature permits the primers 
to hybridize with the target sequence but not with other 
sequences. For primers comprising a consensus segment 
(Type 1), the temperature of the annealing step in repeat 
cycles of a PCR is generally at least about 45° C.; preferably
it is at least about 50° C. It is also preferable to conduct the 
first few cycles of the PCR at even higher temperatures, such 
as 55° C. or even 60° C. The higher temperature will compel 
the annealing to be more sequence specific during the cycle 
and will reduce the background amplification of unrelated 
sequences. Annealing steps for subsequent cycles may be 
performed under slightly less stringent conditions to 
improve the rate of amplification. In an especially preferred 
procedure, the first PCR amplification cycle comprises an 
annealing step of about 1 min conducted at 60° C. Annealing 
steps in subsequent cycles are conducted at 2° C. lower each 
cycle, until a temperature of 50° C. is reached. Further 
cycles are then conducted with annealing steps at 50° C.,
until the desired degree of amplification is achieved. 
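
The especially preferred stepped-down annealing procedure can be written out as a per-cycle temperature list. The sketch below uses the temperatures stated above (60° C. start, 2° C. lower each cycle, held at 50° C.); the function name and the total cycle count are assumptions, since the text continues cycling until the desired degree of amplification is achieved.

```python
def annealing_schedule(total_cycles: int,
                       start_c: float = 60.0,
                       step_c: float = 2.0,
                       floor_c: float = 50.0) -> list[float]:
    """Annealing temperature (in deg C) for each cycle of a stepped-down PCR."""
    temperatures = []
    current = start_c
    for _ in range(total_cycles):
        temperatures.append(current)
        # Lower the temperature for the next cycle, but never below the floor.
        current = max(floor_c, current - step_c)
    return temperatures


# Eight cycles shown for brevity; the total is an assumed value.
schedule = annealing_schedule(total_cycles=8)
print(schedule)  # [60.0, 58.0, 56.0, 54.0, 52.0, 50.0, 50.0, 50.0]
```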

Primers which are virus-specific and do not contain a 
consensus segment (Type 3) are more selective, and may be 
effective over a broader temperature range. Preferred tem- 
peratures for the annealing step in PCR amplification cycles 
are between 50° C. and 65° C. 

Second, the buffer conditions are optimized. We have 
found that buffers supplied with commercial preparations of 
Taq polymerase are sometimes difficult to use, in part 
because of a critical dependence on the concentration of 
magnesium ion. PCRs performed using the oligonucleotides 
of this invention generally are more easily performed using 
a buffer such as that suggested by M. Wigler (Lisitsyn et al.). 
Preferably, the final PCR reaction mixture contains
(NH4)2SO4 instead of KCl as the principal ion source. Preferably,
the concentration of (NH4)2SO4 in the final reaction mixture
is about 5-50 mM, more preferably about 10-30 mM, even
more preferably 16 mM. The buffering component is pref-
erably Tris, preferably at a final concentration of about 67
mM and a pH of about 8.8. Under these conditions, the
MgCl2 concentration is less critical. Preferably the final
concentration is about 1-10 mM, more preferably it is about
3-6 mM, optimally it is about 4 mM. The reaction mixture
may also contain about 10 mM β-mercaptoethanol and
0.05-1 mg/mL bovine serum albumin. An especially pre-
ferred buffer is WB4 buffer (67 mM Tris buffer pH 8.8, 4
mM MgCl2, 16 mM (NH4)2SO4, 10 mM β-mercaptoethanol,
and 0.1 mg/mL albumin). Preferred conditions for perform-
ing the reaction are provided below in Example 3.
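
As a worked illustration of assembling the WB4 buffer components, the following sketch applies the usual dilution relation (stock volume = reaction volume × final concentration / stock concentration). The final concentrations are those stated above; the stock concentrations, the 50 µL reaction volume, and all names in the code are hypothetical and would be replaced by the values actually in use.

```python
# Final WB4 concentrations as stated in the text (mM, except albumin in mg/mL).
WB4_FINAL = {
    "Tris pH 8.8": 67.0,
    "MgCl2": 4.0,
    "(NH4)2SO4": 16.0,
    "beta-mercaptoethanol": 10.0,
    "albumin": 0.1,
}

# Hypothetical stock concentrations in the same units; NOT from the text.
ASSUMED_STOCKS = {
    "Tris pH 8.8": 1000.0,
    "MgCl2": 100.0,
    "(NH4)2SO4": 1000.0,
    "beta-mercaptoethanol": 1000.0,
    "albumin": 10.0,
}


def stock_volumes_ul(final, stocks, reaction_volume_ul):
    """Volume of each stock (in microlitres) needed for one reaction."""
    volumes = {}
    for name, final_conc in final.items():
        stock_conc = stocks[name]
        if stock_conc <= final_conc:
            raise ValueError(f"stock for {name} must exceed the final concentration")
        volumes[name] = reaction_volume_ul * final_conc / stock_conc
    return volumes


# A 50 uL reaction volume is an illustrative assumption.
volumes = stock_volumes_ul(WB4_FINAL, ASSUMED_STOCKS, reaction_volume_ul=50.0)
for name, microlitres in volumes.items():
    print(f"{name}: {microlitres:.2f} uL")
```

The remaining volume is available for primers, deoxynucleotides, template, polymerase, and water.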

To conduct the PCR reaction, a mixture comprising the 
oligonucleotide primers, the four deoxynucleotides, a suit- 
able buffer, the DNA to be amplified, and a heat-stable 
DNA-dependent DNA polymerase is prepared. The mixture 
is then processed through temperature cycles for the 
annealing, elongating, and melting steps until the desired 
degree of amplification is achieved. The amount of DNA
produced can be determined, for example, by staining with
ethidium bromide, optionally after separating amplified
fragments on an agarose gel.

A possible complication of the amplification reaction is 
dimerization and amplification of the oligonucleotide prim-
ers themselves. This can be easily detected as low molecular
weight (<100 base pair) fragments on an agarose gel. 
Amplified primer can be removed by agarose or polyacry- 
lamide gel separation. The amount of amplified dimer may 
be reduced by minor adjustments to the conditions of the
reaction, particularly the temperature of the annealing step. 
It is also preferable to pre-mix the primers, the 
deoxynucleotides, and the buffer, and heat the mixture to 80 
degrees before adding the DNA to be amplified. 

Amplification reactions using any of the oligonucleotides of
this invention as primers yield polynucleotide fragments 
encoding a portion of a Glycoprotein B. These fragments 
can be characterized by a number of techniques known to a 
practitioner of ordinary skill in the art. Some non-limiting 
methods for characterizing a fragment are as follows: 

In one method, a fragment may be sequenced according
to any method of sequence determination known in the art, 
including the Maxam & Gilbert method, or the Sanger & 
Nicholson method. Alternatively, the fragment may be sub- 
mitted to any of the commercial organizations that provide 
a polynucleotide sequencing service. The fragment may
optionally be cloned and/or amplified before sequencing. 
The nucleotide sequence can be used to predict the amino 
acid sequence encoded by the fragment. Sequence data can 
be used for comparison with other sequenced Glycoprotein 
B's, either at the polynucleotide level or the amino acid
level, to identify the species of herpes virus present in the 
original source material. Sequence data can also be used in 
modeling algorithms to predict antigenic regions or three- 
dimensional structure. 

In a second method of characterizing, the size of the
fragment can be determined by any suitable method, such as 
running on a polyacrylamide or agarose gel, or centrifuging 
through an appropriate density gradient. For example, for 
RFHV and KSHV, the fragment between NIVPA and 
TVNCB is about 319 bases. Hence, the length of the entire
amplified fragment including primer binding regions is
about 386 bases. The corresponding fragment of sHV1
contains an additional 6 base pairs. The sHV1 fragment can
therefore be distinguished from that of RFHV or KSHV, for
example, by running amplified polynucleotide fragments
from each in neighboring lanes of a separating gel, or by
running the sHV1 fragment beside suitable molecular
weight standards. Polynucleotide fragments identical in size 
to that of RFHV and KSHV may be from the same or a 
related viral species. Fragments substantially different in
size are more likely to be derived from a different herpes 
virus. 
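
The size arithmetic in this method can be made explicit as follows. The sketch assumes that the primer binding regions contribute the same number of bases to the sHV1 product as to the RFHV and KSHV products, which is not stated in the text; the tolerance value and all names in the code are likewise illustrative.

```python
INNER_RFHV_KSHV_BP = 319  # between NIVPA and TVNCB (from the text)
FULL_RFHV_KSHV_BP = 386   # including primer binding regions (from the text)
SHV1_EXTRA_BP = 6         # additional bases in the sHV1 fragment (from the text)

# Bases contributed by the two primer binding regions together.
PRIMER_REGIONS_BP = FULL_RFHV_KSHV_BP - INNER_RFHV_KSHV_BP

# Assumption: the sHV1 product carries the same primer binding regions.
FULL_SHV1_BP = INNER_RFHV_KSHV_BP + SHV1_EXTRA_BP + PRIMER_REGIONS_BP

EXPECTED_SIZES = {
    "RFHV/KSHV-sized": FULL_RFHV_KSHV_BP,
    "sHV1-sized": FULL_SHV1_BP,
}


def classify_by_size(observed_bp, tolerance_bp=2):
    """Label an amplified fragment by its measured length in base pairs."""
    for label, expected_bp in EXPECTED_SIZES.items():
        if abs(observed_bp - expected_bp) <= tolerance_bp:
            return label
    return "unexpected size"


print(PRIMER_REGIONS_BP, FULL_SHV1_BP)
print(classify_by_size(387), classify_by_size(392), classify_by_size(450))
```

Because the two expected sizes differ by only 6 base pairs, the tolerance must stay below 3 base pairs for the two classes to remain distinct, which in practice calls for a gel system with correspondingly fine resolution.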

In a third method of characterizing, a fragment can be
tested by attempting to hybridize it with an oligonucleotide 
probe. In a preferred example, a fragment is tested for 
relatedness to the Glycoprotein B encoding region of RFHV 
or KSHV. The test is conducted using a probe comprising a 
sequence of a Glycoprotein B encoding region, or its genetic 
complement. Suitable probes are polynucleotides compris- 
ing sequences from RFHV or KSHV, such as the Type 3 
oligonucleotides listed in Table 7. 

The length and nature of the probe and the hybridization 
conditions are selected depending on the objectives of the 
test. If the objective is to detect only polynucleotides from 
RFHV or KSHV, including minor strain variants, then 
hybridization is performed under conditions of high strin- 
gency. A sequence from the respective Glycoprotein B is 
used. Longer length sequences improve the specificity of the 
test and can be used under conditions of higher stringency. 
Preferably, the probe will comprise a Glycoprotein B 
sequence of at least about 30 nucleotides; more preferably, 
the sequence will be at least about 50 nucleotides; even more 
preferably, the sequence will be at least about 75 nucleotides 
in length. 

If the objective is to detect polynucleotides that are 
closely related but not identical to RFHV or KSHV, such as 
in a screening test or a test to recruit previously undescribed 
viruses of the RFHV/KSHV subfamily, then different con- 
ditions are chosen. Sequences from RFHV or KSHV may be 
used, but a mixture of the two or a degenerate probe is 
generally preferred. The length of the sequence and the 
conditions of the hybridization reaction are selected to 
provide sufficient specificity to exclude unwanted 
sequences, but otherwise provide a maximum degree of 
cross-reactivity amongst potential targets. Suitable condi- 
tions can be predicted using the formulas given earlier, by 
calculating the T m and then calculating the corresponding 
temperature for the maximum degree of mismatch to be 
tolerated. The suitability of the conditions can be tested 
empirically by testing the cross-reactivity of the probes with
samples containing known target polynucleotides encoding 
herpes Glycoprotein B. 

The minimum degree of complementarity required for a 
stable duplex to form under the conditions of the assay will 
determine what Glycoprotein B sequences will hybridize 
with the probe. Consider, for example, a target obtained 
from a human or non-human primate, amplified to produce 
a fragment corresponding to bases 36—354 of SEQ. ID 
NO:3, and then probed with the corresponding fragment of 
the KSHV polynucleotide. According to the data in Table 2, 
if the hybridization reaction is performed under conditions 
that require only about 50% identity for a stable duplex to 
form, the probe may hybridize with targets from any of the 
sequenced gamma herpes Glycoprotein B genes, including 
hEBV and sHV1. If the reaction is performed under condi-
tions that require at least about 65% identity between probe
and target, preferably at least about 67% identity, more 
preferably at least about 70% identity, and even more 
preferably at least about 75% identity for a stable duplex to 
form, the assay will detect a target polynucleotide from the 
RFHV/KSHV subfamily; i.e., either RFHV, KSHV, or a
closely related herpes virus with a Glycoprotein B poly- 
nucleotide not yet sequenced. Even under hybridization 
conditions that required only about 50—55% identity for a 
stable duplex to form, a positive reaction would not indicate 
the presence of bHV4, eHV2, or mHV68, since these viruses 
are not believed to be capable of infecting primates. 
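
The relationship between percent identity and the minimum identity demanded by the assay conditions can be illustrated with a short calculation. The sketch below assumes two fragments of equal length that are already aligned without gaps; the sequences are invented toy examples, not sequences of this invention, and the function names are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned positions that match (ungapped, equal length)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def forms_stable_duplex(identity, required_identity):
    """True when the identity meets the minimum required by the conditions."""
    return identity >= required_identity


# Invented 20-base toy sequences differing at the last five positions.
probe = "ATGCATGCATGCATGCATGC"
target = "ATGCATGCATGCATGACGTA"

identity = percent_identity(probe, target)
print(identity)
print(forms_stable_duplex(identity, 65.0), forms_stable_duplex(identity, 90.0))
```

In this toy case the target would be detected under conditions requiring about 65% identity but not under conditions requiring about 90% identity.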

It is possible to combine characterization by size and 
characterization by hybridization. For example, the ampli-
fied polynucleotide may be separated on a gel of acrylamide
or agarose, blotted to a membrane of suitable material, such
as nitrocellulose, and then hybridized with a probe with a
suitable label, such as 32P. The presence of the label after
washing reflects the presence of hybridizable material in the
sample, while the migration distance compared with appro- 
priate molecular weight standards reflects the size of the 
material. A fragment sequence hybridizing with one of the 
aforementioned probes under conditions of high stringency 
but having an unexpected size would indicate a Glycopro-
tein B sequence with a high degree of identity to the probe,
but distinct from either RFHV or KSHV.

Use of polynucleotides and oligonucleotides to detect herpes
virus infection

Polynucleotides encoding herpes virus Glycoprotein B,
and synthetic oligonucleotides based thereupon, as embod- 
ied in this invention, are useful in the diagnosis of clinical 
conditions associated with herpes virus infection. For 
example, the presence of detectable herpes Glycoprotein B 
in a clinical sample may suggest that the respective herpes
virus participated as an etiologic agent in the development of
the condition. The presence of viral Glycoprotein B in a 
particular tissue, but not in surrounding tissue, may be useful 
in the localization of an infected lesion. Differentiating 
between gamma herpes virus and other herpes viruses in
clinical samples may be useful in predicting the clinical 
course of an infection or selecting a drug suitable for 
treatment. Since Glycoprotein B is expressed by replicative 
virus, L-particles, and infected cells, we predict that it will 
serve as a useful marker for active and quiescent stages of
the disease that involve expression of the protein in any of 
these forms. 

The procedures for conducting diagnostic tests are exten- 
sively known in the art, and are routine for a practitioner of 
ordinary skill. Generally, to perform a diagnostic method of
this invention, one of the compositions of this invention is 
provided as a reagent to detect a target in a clinical sample 
with which it reacts. For example, a polynucleotide of this 
invention may be used as a reagent to detect a DNA or RNA
target, such as might be present in a cell infected with a
herpes virus. A polypeptide of this invention may be used as 
a reagent to detect a target with which it is capable of 
forming a specific complex, such as an antibody molecule or 
(if the polypeptide is a receptor) the corresponding ligand. 
An antibody of this invention may be used as a reagent to
detect a target it specifically recognizes, such as a polypep- 
tide expressed by virally infected cells. 

The target is supplied by obtaining a suitable tissue 
sample from an individual for whom the diagnostic param-
eter is to be measured. Relevant test samples are those
obtained from individuals suspected of harboring a herpes 
virus. Many types of samples are suitable for this purpose, 
including those that are obtained near the suspected site of 
infection or pathology by biopsy or surgical dissection, in 
vitro cultures of cells derived therefrom, solubilized
extracts, blood, and blood components. If desired, the target 
may be partially purified from the sample or amplified 
before the assay is conducted. The reaction is performed by 
contacting the reagent with the sample under conditions that 
will allow a complex to form between the reagent and the
target. The reaction may be performed in solution, or on a 
solid tissue sample, for example, using histology sections. 
The formation of the complex is detected by a number of 
techniques known in the art. For example, the reagent may 
be supplied with a label and unreacted reagent may be
removed from the complex; the amount of remaining label
thereby indicating the amount of complex formed. Further
details and alternatives for complex detection are provided
in the descriptions that follow. 

To determine whether the amount of complex formed is 
representative of herpes infected or uninfected cells, the 
assay result is preferably compared with a similar assay 
conducted on a control sample. It is generally preferable to 
use a control sample which is from an uninfected source, and 
otherwise similar in composition to the clinical sample being 
tested. However, any control sample may be suitable pro- 
vided the relative amount of target in the control is known 
or can be used for comparative purposes. It is often prefer- 
able to conduct the assay on the test sample and the control 
sample simultaneously. However, if the amount of complex 
formed is quantifiable and sufficiently consistent, it is 
acceptable to assay the test sample and control sample on 
different days or in different laboratories. 

Accordingly, polynucleotides encoding Glycoprotein B of 
the RFHV/KSHV subfamily, and the synthetic oligonucle- 
otides embodied in this invention, can be used to detect 
gamma herpes virus polynucleotide that may be present in a 
biological sample. General methods for using polynucle- 
otides in specific diagnostic assays are well known in the art: 
see, e.g., Patent Application JP 5309000 (Iatron). 

An assay employing a polynucleotide reagent may be 
rendered specific, for example: 1) by performing a hybrid- 
ization reaction with a specific probe; 2) by performing an 
amplification with a specific primer; or 3) by a combination
of the two. 

To perform an assay that is specific due to hybridization 
with a specific probe, a polynucleotide is chosen with the 
required degree of complementarity for the intended target. 
Preferred probes include polynucleotides of at least about 16 
nucleotides in length encoding a portion of the Glycoprotein 
B of RFHV, KSHV, or another member of the RFHV/KSHV 
subfamily. Increasingly preferred are probes comprising at 
least about 18, 21, 25, 30, 50, or 100 nucleotides of the 
Glycoprotein B encoding region. Also preferred are degen- 
erate probes capable of forming stable duplexes with poly- 
nucleotides of the RFHV/KSHV subfamily under the con- 
ditions used, but not polynucleotides of other herpes viruses. 

The probe is generally provided with a label. Some of the 
labels often used in this type of assay include radioisotopes 
such as 32P and 33P, chemiluminescent or fluorescent
reagents such as fluorescein, and enzymes such as alkaline 
phosphatase that are capable of producing a colored solute 
or precipitant. The label may be intrinsic to the reagent, it 
may be attached by direct chemical linkage, or it may be 
connected through a series of intermediate reactive 
molecules, such as a biotin-avidin complex, or a series of 
inter-reactive polynucleotides. The label may be added to 
the reagent before hybridization with the target 
polynucleotide, or afterwards. To improve the sensitivity of 
the assay, it is often desirable to increase the signal ensuing 
from hybridization. This can be accomplished by using a 
combination of serially hybridizing polynucleotides or 
branched polynucleotides in such a way that multiple label 
components become incorporated into each complex. See 
U.S. Pat. No. 5,124,246 (Urdea et al.). 

If desired, the target polynucleotide may be extracted 
from the sample, and may also be partially purified. To 
measure viral particles, the preparation is preferably 
enriched for DNA; to measure active transcription of Gly- 
coprotein B, the preparation is preferably enriched for RNA. 
Generally, it is anticipated that the level of polynucleotide of 
a herpes virus will be low in clinical samples: there may be 
just a few copies of DNA encoding the Glycoprotein B per 
cell where the virus is latent, or up to several hundred copies 
of DNA per cell where the virus is replicating. The level of
mRNA will be higher in cells where the protein is actively 
expressed than those where the gene is inactive. It may 
therefore be desirable to enhance the level of target in the 
sample by amplifying the DNA or RNA. A suitable method
of amplification is a PCR, which is preferably conducted 
using one or more of the oligonucleotide primers embodied 
in this invention. RNA may be amplified by making a cDNA 
copy using a reverse transcriptase, and then conducting a 
PCR using the aforementioned primers.

The target polynucleotide can be optionally subjected to 
any combination of additional treatments, including diges-
tion with restriction endonucleases, size separation, for
example by electrophoresis in agarose or polyacrylamide,
and affixation to a reaction matrix, such as a blotting
material. 

Hybridization is allowed to occur by mixing the reagent 
polynucleotide with a sample suspected of containing a 
target polynucleotide under appropriate reaction conditions. 
This may be followed by washing or separation to remove
unreacted reagent. Generally, both the target polynucleotide
and the reagent must be at least partly equilibrated into the 
single-stranded form in order for complementary sequences 
to hybridize efficiently. Thus, it may be useful (particularly 
in tests for DNA) to prepare the sample by standard dena-
turation techniques known in the art.

The level of stringency chosen for the hybridization 
conditions depends on the objective of the test. If it is desired 
that the test be specific for RFHV or KSHV, then a probe 
comprising a segment of the respective Glycoprotein B is
used, and the reaction is conducted under conditions of high 
stringency. For example, a preferred set of conditions for use 
with a preferred probe of 50 nucleotides or more is 6×SSC
at 37° C. in 50% formamide, followed by a wash at low ionic
strength. This will generally require the target to be at least
about 90% identical with the polynucleotide probe for a 
stable duplex to form. The specificity of the reaction for the 
particular virus in question can also be increased by increas- 
ing the length of the probe used. Thus, longer probes are 
particularly preferred for this application of the invention.
Alternatively, if it is desired that the test also be able to
detect other herpes viruses related to KSHV, then a lower
stringency is used. Suitable probes include fragments from
the KSHV Glycoprotein B polynucleotide, a mixture
thereof, or oligonucleotides such as those listed in Table 7.

Appropriate hybridization conditions are determined to 
permit hybridization of the probe only to Glycoprotein B 
sequences that have the desired degree of identity with the 
probe. The stringency required depends on the length of the 
polynucleotide probe, and the degree of identity between the
probe and the desired target sequence. Consider, for 
example, a probe consisting of the KSHV polynucleotide 
fragment between the hybridization sites of NIVPA and 
TVNCB. Conditions requiring a minimum identity of 60% 
would result in a stable duplex formed with a corresponding
polynucleotide of KSHV and other gamma herpes viruses
such as sHV1; conditions requiring a minimum identity of
90% would result in a stable duplex forming only with a
polynucleotide from KSHV and closely related variants.
Conditions of intermediate stringency requiring a minimum
identity of 65—70% would permit duplexes to form with a 
Glycoprotein B polynucleotide of KSHV, and some other 
members of the RFHV/KSHV subfamily, but not with 
corresponding polynucleotides of other known herpes 
viruses, including gamma herpes viruses eHV2, sHV1,
mHV68, bHV4, EBV, and other human pathogens such as 
hCMV, hHV6, hVZV, and HSV1. 
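
The effect of the three stringency levels discussed above can be sketched as a simple filter over a panel of candidate targets. The identity values used here are approximations only: the RFHV1 figure follows the roughly 76% identity with KSHV reported for the 319 base pair fragment, and the sHV1 figure is a placeholder chosen within the roughly 60-63% range reported for the closest gamma herpes viruses outside the subfamily. The dictionary and function names are hypothetical.

```python
# Approximate identity of each target to a KSHV fragment probe (percent).
# KSHV is the probe's own sequence; the sHV1 value is a placeholder.
APPROX_IDENTITY_TO_KSHV_PROBE = {
    "KSHV": 100.0,
    "RFHV1": 76.0,
    "sHV1": 62.0,
}


def detected_targets(min_identity, panel=APPROX_IDENTITY_TO_KSHV_PROBE):
    """Targets expected to form a stable duplex at the given minimum identity."""
    return sorted(name for name, identity in panel.items() if identity >= min_identity)


for threshold in (60.0, 68.0, 90.0):
    print(threshold, detected_targets(threshold))
```

With these illustrative values, the low threshold admits all three targets, the intermediate threshold admits only the two subfamily members, and the high threshold admits only KSHV, in line with the behavior described in the text.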

Conditions can be estimated beforehand using the formula
given earlier. Preferably, the exact conditions are confirmed 
by testing the probe with separate samples known to contain 
polynucleotides, both those desired to be detected and those 
desired to go undetected in the assay. Such samples may be 
provided either by synthesizing the polynucleotides from 
published sequences, or by extracting and amplifying DNA 
from tissues believed to be infected with the respective 
herpes virus. Determining hybridization conditions is a 
matter of routine adjustment for a practitioner of ordinary 
skill, and does not require undue experimentation. Since 
eHV2, sHV1, mHV68, bHV4 and EBV are more nearly
identical to the RFHV/KSHV subfamily than are alpha and beta
herpes viruses, conditions that exclude gamma herpes 
viruses outside the RFHV/KSHV subfamily will generally 
also exclude the other herpes viruses listed in Table 1. In 
addition, if it is believed that certain viruses will not be 
present in the sample to be tested in the ultimate determi- 
nation (such as eHV2, mHV68 or bHV4 in a human tissue 
sample), then the corresponding target sequences may 
optionally be omitted when working out the conditions of 
the assay. Thus, conditions can be determined that would 
permit Type 2 oligonucleotide probes such as those listed in 
Table 6 to form a stable duplex with polynucleotides
comprising either SEQ. ID NO:1 or SEQ. ID NO:3, but not with a
sequence selected from the group consisting of SEQ. ID
NOS:5-13. Conditions can also be determined that would
permit a suitable fragment comprising at least 21
consecutive bases of SEQ. ID NO:1 or SEQ. ID NO:3 to
form a stable duplex both with a polynucleotide comprising
SEQ. ID NO:1 and with one comprising SEQ. ID NO:3, but not
with a polynucleotide comprising any one of SEQ. ID NOS:5-13.

Alternatively, to conduct an assay that is specific due to 
amplification with a specific primer: DNA or RNA is pre- 
pared from the biological sample as before. Optionally, the 
target polynucleotide is pre-amplified in a PCR using prim-
ers which are not species specific, such as those listed in
Table 4 or 6. The target is then amplified using specific 
primers, such as those listed in Table 7, or a combination of 
primers from Table 4, 6, and 7. In a preferred embodiment, 
two rounds of amplification are performed, using oligo- 
nucleotide primers in a nested fashion: virus-specific or 
non-specific in the first round; virus-specific in the second 
round. This provides an assay which is both sensitive and 
specific. 

Use of a specific Type 3 primer during amplification is 
sufficient to provide the required specificity. A positive test 
may be indicated by the presence of sufficient reaction 
product at the end of the amplification series. Amplified 
polynucleotide can be detected, for example, by blotting the 
reaction mixture onto a medium such as nitrocellulose and 
staining with ethidium bromide. Alternatively, a radiola- 
beled substrate may be added to the mixture during a final 
amplification cycle; the incorporated label may be separated 
from unincorporated label (e.g., by blotting or by size 
separation), and the label may be detected (e.g. by counting 
or by autoradiography). If run on a gel of agarose or 
polyacrylamide, the size of the product may help confirm the
identity of the amplified fragment. Specific amplification 
can also be followed by specific hybridization, by using the 
amplification mixture obtained from the foregoing proce- 
dure as a target source for the hybridization reaction outlined 
earlier. 

Use of polynucleotides for gene therapy 

Embodied in this invention are pharmaceuticals compris- 
ing virus-specific polynucleotides, polypeptides, or antibod- 
ies as an active ingredient. Such compositions may decrease 
the pathology of the virus or infected cells on their own, or
render the virus or infected cells more susceptible to treat- 
ment by non-specific pharmaceutical compounds. 

Polynucleotides of this invention encoding part of a 
herpes virus Glycoprotein B may be used, for example, for
administration to an infected individual for purposes of gene 
therapy (see generally U.S. Pat. No. 5,399,346: Anderson et 
al.). The general principle is to administer the polynucle-
otide in such a way that it either promotes or attenuates the
expression of the polypeptide encoded therein.

A preferred mode of gene therapy is to provide the 
polynucleotide in such a way that it will be replicated inside 
the cell, enhancing and prolonging the effect. Thus, the 
polynucleotide is operatively linked to a suitable promoter, 
such as the natural promoter of the corresponding gene, a
heterologous promoter that is intrinsically active in cells of 
the target tissue type, or a heterologous promoter that can be 
induced by a suitable agent. Entry of the polynucleotide into 
the cell is facilitated by suitable techniques known in the art, 
such as providing the polynucleotide in the form of a
suitable vector, such as a viral expression vector, or encap- 
sulation of the polynucleotide in a liposome. The polynucle- 
otide may be injected systemically, or provided to the site of 
infection by an antigen-specific homing mechanism, or by
direct injection.

In one variation, the polynucleotide comprises a promoter 
linked to the polynucleotide strand with the same orientation 
as the strand that is transcribed during the course of a herpes 
virus infection. Preferably, the Glycoprotein B that is 
encoded includes an external component, a transmembrane
component, and signal sequences for transport to the sur- 
face. Virally infected cells transfected with polynucleotides 
of this kind are expected to express an enhanced level of 
Glycoprotein B at the surface. Enhancing Glycoprotein B 
expression in this fashion may enhance recognition of these
cells by elements of the immune system, including antibody
(and antibody-dependent effectors such as ADCC), and
virus-specific cytotoxic T cells. 

In another variation, the polynucleotide comprises a pro-
moter linked to the polynucleotide strand with the opposite
orientation as the strand that is transcribed during the course 
of a herpes virus infection. Virally infected cells transfected 
with polynucleotides of this kind are expected to express a 
decreased level of Glycoprotein B. The transcript is 
expected to hybridize with the complementary strand tran-
scribed by the viral gene, and prevent it from being trans-
lated. This approach is known as anti-sense therapy.
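
The sequence relationship underlying the anti-sense variation can be illustrated as follows: a transcript made from the opposite orientation is the reverse complement of the sense transcript and can therefore pair with it. The nine-base coding sequence below is invented for illustration and is not a sequence of this invention; the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence may contain only A, C, G and T")
    return dna.translate(_COMPLEMENT)[::-1]


def to_rna(dna):
    """Write a DNA sequence as the RNA of identical sense (T becomes U)."""
    return dna.upper().replace("T", "U")


coding_strand = "ATGACCGTT"  # invented toy sequence

sense_mrna = to_rna(coding_strand)
antisense_rna = to_rna(reverse_complement(coding_strand))

print(sense_mrna)
print(antisense_rna)
```

Read in opposite directions, each base of the anti-sense transcript pairs with the corresponding base of the sense transcript, which is the basis for the expected block of translation.
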

RFHV/KSHV subfamily polypeptides with Glycoprotein B
activity and fragments thereof

The RFHV and KSHV polynucleotide sequences shown
in FIG. 1 have open reading frames. The polypeptides
encoded thereby are shown in SEQ. ID NO: 2 and SEQ. ID
NO: 4, respectively. Encoded between the hybridizing 
regions of the primers NIVPA and TVNCB used to obtain 
the polynucleotide sequence is a 106 amino acid fragment of
the Glycoprotein B molecule which is 91% identical 
between RFHV and KSHV. The full protein sequence of 
KSHV Glycoprotein B is shown in SEQ. ID NO:94. A 
Glycoprotein B fragment of a third member of the RFHV/ 
KSHV subfamily, RFHV2, is shown in SEQ. ID NO: 97.
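
The correspondence between the 319 base fragment and the 106 amino acid fragment follows from the three-base codon length, as the short calculation below shows. The function name is hypothetical; how the one leftover base is distributed relative to the reading frame is not stated in this passage and is not assumed here.

```python
def codon_count(n_bases):
    """Return (complete codons, leftover bases) for a fragment length."""
    if n_bases < 0:
        raise ValueError("fragment length cannot be negative")
    return divmod(n_bases, 3)


complete_codons, leftover_bases = codon_count(319)
print(complete_codons, leftover_bases)
```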

There are a number of homologous residues to Glyco- 
protein B molecules of other sequenced herpes viruses. The 
longest sequence contained in SEQ. ID NO:2 or SEQ. ID 
NO: 4 but not in the known sequences of other herpes viruses 
is 9 amino acids in length, with two exceptions (SEQ. ID
NOS:64 and 65). Longer matching portions are found else- 
where in the Glycoprotein B amino acid sequence. The 
longest is the 21 amino acid sequence from bHV4 shown in
SEQ. ID NO: 99; the rest are all 16 amino acids long or less. 
Other than the SEQ. ID NO: 99 exception, any fragment of the
RFHV and KSHV Glycoprotein B protein sequence that is 
17 amino acids or longer is believed to be specific for RFHV 
or KSHV, respectively, or to closely related strains. Since 
bHV4 and the other viruses with matching segments are not 
believed to be capable of infecting primates, any fragment of 
about 10 amino acids or more found in a primate that was 
contained in SEQ. ID NO: 4 would indicate the presence of 
an infectious agent closely related to KSHV. 
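
The kind of comparison described above, finding the longest segment shared between two Glycoprotein B sequences, can be sketched with the Python standard library. The 21 amino acid segment is the one listed for SEQ. ID NO:99 in Table 8; the two-residue flanks added on either side are invented solely so that the toy sequences differ outside the shared segment.

```python
from difflib import SequenceMatcher


def longest_shared_segment(seq_a, seq_b):
    """Longest contiguous stretch present in both sequences."""
    matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    match = matcher.find_longest_match(0, len(seq_a), 0, len(seq_b))
    return seq_a[match.a:match.a + match.size]


SHARED_21MER = "IYAEPGWFPGIYRVRTTVNCE"  # as listed for SEQ. ID NO:99

# Invented flanking residues; these are NOT real viral sequences.
toy_sequence_a = "MK" + SHARED_21MER + "DL"
toy_sequence_b = "QS" + SHARED_21MER + "HW"

segment = longest_shared_segment(toy_sequence_a, toy_sequence_b)
print(segment, len(segment))
```

A fragment longer than the longest shared segment cannot be contained entirely within the other sequence, which is the reasoning behind the length thresholds given above.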

This invention embodies both intact Glycoprotein B from 
herpes viruses of the RFHV/KSHV subfamily, and any 
fragment thereof that is specific for the subfamily. Preferred 
Glycoprotein B fragments of this invention are at least 10 
amino acids in length; more preferably they are at least 13 
amino acids in length; more preferably they are at least 17 
amino acids in length; more preferably they are at least about 
20 amino acids in length; even more preferably they are at 
least about 25 amino acids in length, still more preferably 
they are at least about 30 amino acids in length. 

The amino acid sequence of the RFHV and KSHV 
Glycoprotein B fragment shown in SEQ. ID NOS:2, 4, 94 
and 96 can be used to identify virus-specific and cross- 
reactive antigenic regions. 

In principle, a specific antibody could recognize any 
amino acid difference between sequences that is not also 
shared by the species from which the antibody is derived. 
Antibody binding sites are generally big enough to encom- 
pass 5—9 amino acid residues of an antigen, and are quite 
capable of recognizing a single amino acid difference. 
Specific antibodies may be part of a polyclonal response 
arising spontaneously in animals infected with a virus 
expressing the Glycoprotein B. Specific antibodies may also 
be induced by injecting an experimental animal with either 
the intact Glycoprotein B or a Glycoprotein B fragment. 

Thus, any peptide of 5 amino acids or more that is unique 
to KSHV is a potential virus-specific antigen, and could be 
recognized by a KSHV-specific antibody. Similarly, any 
peptide of sufficient length shared within the RFHV/KSHV 
subfamily but not with other herpes viruses is a potential 
subfamily-specific antigen. 
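
A search for candidate virus-specific peptides of a given length can be sketched as a set difference over overlapping windows. The two 7 amino acid inputs are the Class III peptides listed in Table 8 for KSHV (SEQ. ID NO:75) and RFHV (SEQ. ID NO:73); in practice full-length sequences would be compared. The function names are hypothetical.

```python
def windows(sequence, k):
    """All overlapping peptides of length k contained in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def unique_peptides(target, others, k):
    """Peptides of length k found in target but in none of the others."""
    background = set()
    for other in others:
        background |= windows(other, k)
    return sorted(windows(target, k) - background)


kshv_peptide = "RGTTESA"  # Table 8, SEQ. ID NO:75
rfhv_peptide = "RGMTEAA"  # Table 8, SEQ. ID NO:73

candidates = unique_peptides(kshv_peptide, [rfhv_peptide], k=5)
print(candidates)
```

Every 5-residue window of the KSHV peptide spans at least one position at which the two peptides differ, so all three windows are retained as candidates.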

Some examples of preferred peptides are shown in Table 
8. Practitioners in the art will immediately recognize that 
other peptides with similar specificities may be designed by 
minor alterations to the length of the peptides listed and/or 
moving the frame of the peptide a few residues in either 
direction. 

The Class I peptides shown in Table 8 are conserved 
between Glycoprotein B of KSHV and that of certain other 
members of the gamma herpes virus subfamily. An antibody 
directed against one such Glycoprotein B in this region may 
therefore cross-react with some of the others. Class II 
peptides are conserved between Glycoprotein B of RFHV 
and KSHV, but not with other gamma herpes viruses. An 
antibody directed against this region is expected to cross- 
react between RFHV, KSHV, and other viruses of the 
RFHV/KSHV subfamily; but not with herpes viruses outside 
the subfamily. Class III peptides are different between 
Glycoprotein B of RFHV, KSHV, and other known gamma 
herpes viruses. An antibody binding to this region, particu- 
larly to non-identical residues contained therein, is expected 
to distinguish RFHV and KSHV Glycoprotein B from each 
other, and from Glycoprotein B of more distantly related 
herpes viruses. 
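
The three classes can be expressed as a simple decision rule. The sketch below is illustrative only: it assumes exact substring matching (real conservation analysis would tolerate mismatches), and the toy sequences are concatenations of Table 8 peptides rather than actual Glycoprotein B sequences.

```python
def classify(peptide, rfhv, kshv, other_gamma):
    """Assign a peptide to Class I, II or III by where it occurs."""
    in_rfhv = peptide in rfhv
    in_kshv = peptide in kshv
    in_other = any(peptide in seq for seq in other_gamma)
    if in_kshv and in_other:
        return "Class I"      # shared with some other gamma herpes virus
    if in_rfhv and in_kshv and not in_other:
        return "Class II"     # shared only within the RFHV/KSHV subfamily
    if in_rfhv != in_kshv and not in_other:
        return "Class III"    # found in exactly one of RFHV and KSHV
    return "unclassified"


# Hypothetical stand-ins built from Table 8 peptides, for illustration only.
rfhv_toy = "VTVYRG" + "RGMTEAA" + "RYFSQP"
kshv_toy = "VTVYRG" + "RGTTESA" + "RYFSQP"
other_toy = ["RYFSQP"]

for pep in ("RYFSQP", "VTVYRG", "RGTTESA"):
    print(pep, classify(pep, rfhv_toy, kshv_toy, other_toy))
```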
TABLE 8

Antigen Peptides

Specificity                   Sequence                Length   SEQ. ID NO:

Class I: Shared amongst RFHV/KSHV subfamily and some other
gamma herpes viruses
  Shared with bHV4            YRKATSVTVYRG            13       64
  Shared with bHV4, mHV68     RYFSQP                  6        66
  Shared with bHV4            IYAEPGWFPGIYRVR         15       65
                              IYAEPGWFPGIYRVRTTVNCE   21       99
  Shared with mHV68           VLEELSRAWCREQVRD        16       100

Class II: Shared amongst RFHV/KSHV subfamily
                              VTVYRG                  6        67
                              AITNKYE                 7        68
                              SHMDSTY                 7        69
                              VENTFTD                 7        70
                              TVFLQPV                 7        71
                              TDNIQRY                 7        72

Class III: Virus specific (1)
  Specific for RFHV           RGMTEAA                 7        73
  Specific for KSHV           RGTTESA                 7        75
  Specific for RFHV           PVLYSEP                 7        74
  Specific for KSHV           PVIYAEP                 7        76

(1) Not shared with any other sequenced herpes virus; may be
present in some unsequenced RFHV/KSHV subfamily viruses



Particularly preferred peptides from Class III are those 
encompassing regions of Glycoprotein B with the polarity 
characteristics appropriate for an antigen epitope, as 
described in the Example section. Given the complete 
sequence of a Glycoprotein B from KSHV and other mem- 
bers of the RFHV/KSHV subfamily, virus- or subfamily- 
specific peptides can be predicted for other regions of the 
molecule by a similar analysis. 
Preparation of polypeptides 

Polypeptides of this invention may be prepared by several 
different methods, all of which will be known to a practi- 
tioner of ordinary skill. 

For example, short polypeptides of about 5-50 amino
acids in length are conveniently prepared from sequence
data by chemical synthesis. A preferred method is the
solid-phase Merrifield technique. Alternatively, a messenger
RNA encoding the desired polypeptide may be isolated or 
synthesized according to one of the methods described 
earlier, and translated using an in vitro translation system, 
such as the rabbit reticulocyte system. See, e.g., Dorsky et 
al. 

Longer polypeptides, up to and including the entire Gly- 
coprotein B, are conveniently prepared using a suitable 
expression system. For example, the encoding strand of a 
full-length cDNA can be operatively linked to a suitable 
promoter, inserted into an expression vector, and transfected 
into a suitable host cell. The host cell is then cultured under 
conditions that allow transcription and translation to occur, 
and the polypeptide is subsequently recovered. For 
examples of the expression and recovery of Glycoprotein B 
from other species of herpes virus, see, for example, U.S. 
Pat. No. 4,642,333 (Person); U.S. Pat. No. 5,244,792 (Burke 
et al.); Manservigi et al.

For many purposes, it is particularly convenient to use a 
recombinant Glycoprotein B polynucleotide that includes 
the regions encoding signals for transport to the cell surface, 
but lacks the region encoding the transmembrane domain of 
the protein. The polynucleotide may be truncated 5' to the 
transmembrane encoding region, or it may comprise both
extracellular and cytoplasmic encoding regions but lack the
transmembrane region. Constructs of this nature are
expected to be secreted from the cell in a soluble form. 



Where it is desirable to have a Glycoprotein B fragment that 
is a monomer, the recombinant may be designed to limit 
translation to about the first 475 amino acids of the protein. 

For example, to express any of these forms of Glycoprotein B
in yeast, a cassette may be prepared using the
glyceraldehyde-3-phosphate-dehydrogenase (GAPDH) promoter
region and terminator region. GAPDH gene fragments are
identified in a yeast library, isolated and ligated in
the appropriate configuration. The cassette is cloned into
pBR322, isolated and confirmed by DNA sequencing. A
pC1/1 plasmid is constructed containing a Glycoprotein B
insert and GAPDH promoter and terminator regions. The
plasmid is used to transform yeast strain S. cerevisiae. After
culture, the yeast cells are pelleted by centrifugation and
resuspended in a buffer containing protease inhibitors such as
1 mM phenylmethylsulfonyl fluoride and 0.1 μg/ml pepstatin.
The washed cells are disrupted by vortexing with glass
beads and recentrifuged. The presence in the supernatant of
a Glycoprotein B of the correct size may be confirmed, for
example, by Western blot using an antibody against
Glycoprotein B, prepared as described in a following section.
Glycoprotein B may be purified from the supernatant by a
combination of standard protein chemistry techniques,
including ion exchange chromatography, affinity
chromatography using antibody or substrate, and high-pressure
liquid chromatography.

To express Glycoprotein B in mammalian cells, for
example, a mammalian expression vector such as pSV1/dhfr
may be used. This has an ampicillin-resistance
beta-lactamase gene, and a selectable mammalian cell marker,
dihydrofolate reductase linked to the SV40 early promoter.
Glycoprotein B polynucleotide blunt-end fragments are
ligated into the pSV1/dhfr vector and digested with
endonucleases to provide a cassette including the SV40 promoter,
the Glycoprotein B encoding region, and the SV40 splice and
polyadenylation sites. The plasmids are used, for example,
to transform CHO cells deficient in dhfr, and transfectants
are selected. Cells expressing Glycoprotein B may be
identified, for example, by immunofluorescence using
anti-Glycoprotein B as the primary antibody.

In another example, recombinant plasmids for expressing
Glycoprotein B are cloned under control of the Rous sarcoma
virus long terminal repeat in the episomal replicating
vector pRP-RSV. This plasmid contains the origin of
replication and early region of the human papovavirus BK, as
well as the dhfr resistance marker. The vector can then be
used, for example, to transform human 293 cells. By using
a Glycoprotein B encoding region devoid of the
transmembrane spanning domain, the Glycoprotein B polypeptide is
constitutively secreted into the culture medium at 0.15-0.25
pg/cell/day. In the presence of 0.6-6 μM methotrexate,
production may be increased 10- to 100-fold, because of an
amplification of the episomal recombinant. Glycoprotein B
prepared in this way is appropriate, inter alia, for use in
diagnosis, and to prepare vaccines protective against new
and recurrent herpes virus infections (Manservigi et al.).
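
As a worked example of the figures quoted above, the following sketch computes the expected daily output of a culture. The culture size of 1,000,000 cells and the function name are assumptions made for illustration; pairing the low secretion rate with the 10-fold increase and the high rate with the 100-fold increase simply gives the widest range consistent with the stated numbers.

```python
def secreted_range(cells, per_cell_pg=(0.15, 0.25), fold=(10.0, 100.0)):
    """Return daily yield ranges in micrograms: (unamplified, amplified).

    per_cell_pg is the secretion rate in pg/cell/day and fold is the range of
    increase attributed to methotrexate amplification.
    """
    low_ug = cells * per_cell_pg[0] / 1e6   # 1 microgram = 1e6 picograms
    high_ug = cells * per_cell_pg[1] / 1e6
    return (low_ug, high_ug), (low_ug * fold[0], high_ug * fold[1])


# Hypothetical culture of one million cells.
unamplified, amplified = secreted_range(1_000_000)
print("unamplified (ug/day):", unamplified)
print("amplified   (ug/day):", amplified)
```

For this hypothetical culture the unamplified yield is roughly 0.15 to 0.25 μg per day, and the amplified yield roughly 1.5 to 25 μg per day.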
Use of polypeptides to assess herpes virus infection

The polypeptides embodied in this invention may be used 
to detect or assess the status of a herpes virus infection in an 
individual in several different applications. 

In one application, a polypeptide comprising a portion of a
herpes virus Glycoprotein B is supplied as a reagent for an
assay to detect the presence of antibodies that can
specifically recognize it. Such antibodies may be present, for
example, in the circulation of an individual with current or
past herpes virus exposure.

The presence of antibodies to Glycoprotein B in the
circulation may provide a sensitive and early indication of
viral infection. Since Glycoprotein B is a functional
component of the viral envelope, it is produced in greater
quantity than other gene products sequestered within the viral
particle. Its distribution is wider than that of gene products
that appear only transiently in the life cycle of the virus.
Furthermore, it may be expressed not only by intact virus, but
also by non-infective products of virally infected cells, such
as L-particles. Glycoprotein B from various species of herpes
virus is known to be strongly immunogenic. Thus, detection
of antibody to Glycoprotein B in an individual may be
an indication of ongoing active herpes virus infection, latent
infection, previous exposure, or treatment with a
Glycoprotein B vaccine.

Suitable clinical samples in which to measure antibody
levels include serum or plasma from an individual suspected
of having a gamma herpes virus infection. The presence of
the antibody is determined, for example, by an immunoassay.

A number of immunoassay methods are established in the
art for performing the quantitation of antibody using viral
peptides (see, e.g., U.S. Pat. No. 5,350,671: Houghton et
al.). For example, the test sample potentially containing the
specific antibody may be mixed with a pre-determined
non-limiting amount of the reagent polypeptide. The reagent
may contain a directly attached label, such as an enzyme or
a radioisotope. For a liquid-phase assay, unreacted reagents
are removed by a separation technique, such as filtration or
chromatography. Alternatively, the antibody in the sample
may be first captured by a reagent on a solid phase. This may
be, for example, the specific polypeptide, an
anti-immunoglobulin, or Protein A. The captured antibody is
then detected with a second reagent, such as the specific
polypeptide, anti-immunoglobulin, or Protein A with an
attached label. At least one of the capture reagent or the
detecting reagent must be the specific polypeptide. In a third
variation, cells or tissue sections containing the polypeptide
may be overlaid first with the test sample containing the
antibody, and then with a detecting reagent such as labeled
anti-immunoglobulin. In all these examples, the amount of
label captured in the complex is positively related to the
amount of specific antibody present in the test sample.
Similar assays can be designed in which antibody in the test
sample competes with labeled antibody for binding to a
limiting amount of the specific peptide. The amount of label
in the complex is then negatively correlated with the amount
of specific antibody in the test sample. Results obtained
using any of these assays are compared between test
samples and control samples from an uninfected source.
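
The comparison between test and control samples can be sketched as follows. This is an illustration under stated assumptions: the threshold of three standard deviations from the mean of the uninfected controls is a common laboratory convention and is not specified in this disclosure, and the signal values shown are invented for the example.

```python
from statistics import mean, stdev


def is_positive(signal, control_signals, competitive=False, n_sd=3.0):
    """Call a test sample positive relative to uninfected controls.

    Direct assays: label rises with specific antibody, so a signal above the
    control mean + n_sd standard deviations is called positive.
    Competitive assays: label falls with specific antibody, so a signal below
    the control mean - n_sd standard deviations is called positive.
    """
    m, s = mean(control_signals), stdev(control_signals)
    if competitive:
        return signal < m - n_sd * s
    return signal > m + n_sd * s


# Invented readings, for illustration only.
direct_controls = [0.10, 0.12, 0.11, 0.09, 0.13]
competitive_controls = [1.00, 0.95, 1.05, 0.98, 1.02]

print(is_positive(0.50, direct_controls))
print(is_positive(0.40, competitive_controls, competitive=True))
```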

By selecting the reagent polypeptide appropriately,
antibodies of a desired specificity may be detected. For example,
if the intact Glycoprotein B is used, or a fragment
comprising regions that are conserved between herpes viruses, then
antibodies detected in the test samples may be virus specific,
cross-reactive, or both. A multi-epitope reagent is preferred
for a general screening assay for antibodies related to herpes
virus infection. To render the assay specific for antibodies
directed either against RFHV or against KSHV, antigen
peptides comprising non-conserved regions of the appropriate
Glycoprotein B molecule are selected, such as those
listed in Class III of Table 8. Preferably, a mixture of such
peptides is used. To simultaneously detect antibodies against
RFHV, KSHV, and closely related viruses of the gamma
herpes family, but not sHV1 and EBV, antigen peptides are
selected with the properties of those listed in Class II of
Table 8. Preferably, a mixture of such peptides is used.
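
The selection rule just described can be sketched as a lookup over the Class II and Class III peptides of Table 8. The dictionary layout, the goal labels, and the function name are hypothetical conveniences for illustration; only the peptide sequences are taken from the table.

```python
# Class II and Class III peptides as listed in Table 8.
TABLE_8 = {
    "Class II": ["VTVYRG", "AITNKYE", "SHMDSTY", "VENTFTD", "TVFLQPV", "TDNIQRY"],
    "Class III": {
        "RFHV": ["RGMTEAA", "PVLYSEP"],
        "KSHV": ["RGTTESA", "PVIYAEP"],
    },
}


def reagent_peptides(goal, virus=None):
    """Return the peptide mixture for a subfamily- or virus-specific assay."""
    if goal == "subfamily":
        return list(TABLE_8["Class II"])
    if goal == "virus":
        if virus not in TABLE_8["Class III"]:
            raise ValueError("virus must be 'RFHV' or 'KSHV'")
        return list(TABLE_8["Class III"][virus])
    raise ValueError("goal must be 'subfamily' or 'virus'")


print(reagent_peptides("virus", "KSHV"))
print(reagent_peptides("subfamily"))
```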

Antibodies stimulated during a herpes virus infection may 
subside once the infection resolves, or they may persist as 
part of the immunological memory of the host. In the latter 
instance, antibodies due to current infection may be distin- 
guished from antibodies due to immunological memory by 
determining the class of the antibody. For example, an assay 
may be conducted in which antibody in the test sample is 
captured with the specific polypeptide, and then developed 
with labeled anti-IgM or anti-IgG. The presence of specific 
antibody in the test sample of the IgM class indicates 
ongoing infection, while the presence of IgG antibodies 
alone indicates that the activity is due to immunological 
memory of a previous infection or vaccination. 
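
The interpretation of antibody class described above reduces to a small decision rule, sketched here for illustration. The function name and the wording of the returned strings are assumptions; the logic follows the preceding paragraph only.

```python
def interpret_isotypes(igm_positive, igg_positive):
    """Interpret a polypeptide-capture assay developed with anti-IgM and anti-IgG."""
    if igm_positive:
        # Specific IgM indicates ongoing infection, whether or not IgG is present.
        return "ongoing infection"
    if igg_positive:
        # IgG alone is attributed to immunological memory.
        return "immunological memory of a previous infection or vaccination"
    return "no specific antibody detected"


print(interpret_isotypes(igm_positive=True, igg_positive=True))
print(interpret_isotypes(igm_positive=False, igg_positive=True))
```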
Use of polypeptides to design or screen anti-viral drugs 

Interfering with the Glycoprotein B gene or gene product 
would modify the infection process, or the progress of this 
disease. It is an objective of this invention to provide a 
method by which useful pharmaceutical compositions and 
methods of employing such compounds in the treatment of 
gamma herpes virus infection can be developed and tested. 
Particularly preferred are pharmaceutical compounds useful 
in treating infections by RFHV, KSHV and other members 
of the RFHV/KSHV subfamily. Suitable drugs are those that 
interfere with transcription or translation of the Glycoprotein 
B gene, and those that interfere with the biological function 
of the polypeptide encoded by the gene. It is not necessary 
that the mechanism of interference be known; only that the 
interference be preferential for reactions associated with the 
infectious process. 

Preferred drugs include those that competitively interfere 
with the binding of the Glycoprotein B to its substrate on 
target cells, such as heparan sulfate and its analogs. Also 
preferred are drugs that competitively interfere with any
interaction of Glycoprotein B with other viral envelope
components that may be necessary for the virus to exert one of
its biologic functions, such as penetration of target cells.
Also preferred are molecules capable of cross-linking or 
otherwise immobilizing the Glycoprotein B, thereby pre- 
venting it from binding its substrate or performing any 
biological function that plays a role in viral infectivity. 

This invention provides methods for screening
pharmaceutical candidates to determine which are suitable for
clinical use. The methods may be brought to bear on
antiviral compounds that are currently known, and those
which may be designed in the future.

The method involves combining an active Glycoprotein B
with the pharmaceutical candidate, and determining whether
the biochemical function is altered by the pharmaceutical
candidate. The Glycoprotein B may be any fragment
encoded by the Glycoprotein B gene of the RFHV/KSHV
subfamily that has Glycoprotein B activity. Suitable
fragments may be obtained by expressing a genetically
engineered polypeptide comprising an active site of the molecule,
or by cleaving the Glycoprotein B with proteases and
purifying the active fragments. In a preferred embodiment,
the entire Glycoprotein B is provided. The reaction mixture
will also comprise other components necessary to measure
the biological activity of the protein. For example, in an
assay to measure substrate binding, heparan sulfate or an
analog thereof may be provided, perhaps linked to a solid
support to facilitate measurement of the binding reaction.

One embodiment of the screening method is to measure
binding of the pharmaceutical candidate directly to the
isolated Glycoprotein B, or a fragment thereof. Compounds
that bind to an active site of the molecule are expected to
interfere with Glycoprotein B activity. Thus, the entire
Glycoprotein B, or a fragment comprising the active site, is
mixed with the pharmaceutical candidate. Binding of the
candidate can be measured directly, for example, by
providing the candidate in a radiolabeled or stable-isotope
labeled form. The presence of label bound to the
Glycoprotein B can be determined, for example, by precipitating the
Glycoprotein B with a suitable antibody, or by providing the
molecule attached to a solid phase, and washing the solid
phase after the reaction. Binding of the candidate to the
Glycoprotein B may also be observed as a conformational
change, detected for example by difference spectroscopy,
nuclear magnetic resonance, or circular dichroism.
Alternatively, binding may be determined in a competitive
assay: for example, Glycoprotein B is mixed with the
candidate, and then labeled nucleotide or a fragment of a
regulatory subunit is added later. Binding of the candidate to
the biochemically relevant site should inhibit subsequent
binding of the labeled compound.

A second embodiment of the screening method is to
measure the ability of the pharmaceutical candidate to
inhibit the binding of Glycoprotein B to a substrate or
substrate analog. A preferred analog is heparin, coupled to a
solid support such as Sepharose™ beads. Inhibition may be
measured, for example, by providing a radiolabel to the
Glycoprotein B, incubating it with the pharmaceutical
candidate, adding the affinity resin, then washing and
counting the resin to determine if the candidate has decreased the
amount of radioactivity bound. Pharmaceutical candidates
may also be tested for their ability to competitively interfere
with interactions between Glycoprotein B and other herpes
virus proteins.
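
The readout of such a resin-binding assay is commonly summarized as a percent inhibition. The sketch below assumes a background count measured on resin without labeled Glycoprotein B, which is not described in this disclosure; the counts shown are invented for illustration.

```python
def percent_inhibition(bound_with, bound_without, background=0.0):
    """Percent decrease in resin-bound radioactivity caused by a candidate.

    bound_with    -- counts bound in the presence of the candidate
    bound_without -- counts bound in the absence of the candidate
    background    -- counts on the resin without labeled protein (assumed control)
    """
    specific_without = bound_without - background
    if specific_without <= 0:
        raise ValueError("no specific binding in the uninhibited control")
    specific_with = bound_with - background
    return 100.0 * (1.0 - specific_with / specific_without)


# Invented counts per minute, for illustration only.
print(percent_inhibition(bound_with=2600, bound_without=10100, background=100))
```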

A third embodiment of the screening method is to
measure the ability of the pharmaceutical candidate to inhibit an
activity of an active particle, such as a viral particle,
mediated by Glycoprotein B. A particle is engineered to express
Glycoprotein B, but not other components that are capable
of mediating the same function. The ability of the particle to
exhibit a biological function, such as substrate binding or
membrane fusion, is then measured in the presence and
absence of the pharmaceutical candidate by providing an
appropriate target.

This invention also provides for the development of
pharmaceuticals for the treatment of herpes infection by
rational drug design. See, generally, Hodgson, and Erickson
et al. In this embodiment, the three-dimensional structure of
the Glycoprotein B is determined, either by predictive
modeling based on the amino acid sequence, or preferably,
by experimental determination. Experimental methods
include antibody mapping, mutational analysis, and the
formation of anti-idiotypes. Especially preferred is X-ray
crystallography. Knowing the three-dimensional structure of
the glycoprotein, especially the orientation of important
amino acid groups near the substrate binding site, a
compound is designed de novo, or an existing compound is
suitably modified. The designed compound will have an
appropriate charge balance, hydrophobicity, and/or shape to
permit it to attach near an active site of the Glycoprotein B,
and sterically interfere with the normal biochemical function
of that site. Preferably, compounds designed by this method
are subsequently tested in a drug screening assay, such as
those outlined above.

Antibodies against Glycoprotein B and their preparation 

The amino acid sequences of the Glycoprotein B
molecules embodied herein are foreign to the hosts they infect.
Glycoprotein B from other species of herpes virus is
known to be strongly immunogenic in mammals.
Anti-Glycoprotein B is formed in humans, for example, as a usual
consequence of infection with hCMV. By analogy, it is
expected that Glycoprotein B of RFHV, KSHV, and other
members of the RFHV/KSHV subfamily will be
immunogenic in mammals, including humans. These expectations
are supported by the observations described in the Example
section below.

Antibodies against a polypeptide are generally prepared
by any method known in the art. To stimulate antibody
production in an animal experimentally, it is often preferable
to enhance the immunogenicity of a polypeptide by such
techniques as polymerization with glutaraldehyde, or
combining with an adjuvant, such as Freund's adjuvant. The
immunogen is injected into a suitable experimental animal:
preferably a rodent for the preparation of monoclonal
antibodies; preferably a larger animal such as a rabbit or sheep
for preparation of polyclonal antibodies. It is preferable to
provide a second or booster injection after about 4 weeks,
and begin harvesting the antibody source no less than about
1 week later.

Sera harvested from the immunized animals provide a 
source of polyclonal antibodies. Detailed procedures for 
purifying specific antibody activity from a source material 
are known within the art. If desired, the specific antibody 
activity can be further purified by such techniques as protein 
A chromatography, ammonium sulfate precipitation, ion 
exchange chromatography, high-performance liquid
chromatography and immunoaffinity chromatography on a
column of the immunizing polypeptide coupled to a solid
support. 

Polyclonal antibodies raised by immunizing with an intact 
Glycoprotein B or a fragment comprising conserved 
sequences may be cross-reactive between herpes viruses. 
Antibodies that are virus or subfamily specific may be raised 
by immunizing with a suitably specific antigen, such as 
those listed above in Table 8. Alternatively, polyclonal 
antibodies raised against a larger fragment may be rendered
specific by removing unwanted activity against the
Glycoprotein B of other viruses, for example, by passing the
antibodies over an adsorbent made from the Glycoprotein B of
those viruses and collecting the unbound fraction.

Alternatively, immune cells such as splenocytes can be 
recovered from the immunized animals and used to prepare 
a monoclonal antibody-producing cell line. See, for 
example, Harlow & Lane (1988), U.S. Pat. No. 4,472,500
(Milstein et al.), and U.S. Pat. No. 4,444,887 (Hoffman et 
al.). 
Briefly, an antibody-producing line can be produced inter
alia by cell fusion, or by transforming antibody-producing
cells with Epstein Barr Virus, or transforming with
oncogenic DNA. The treated cells are cloned and cultured, and
clones are selected that produce antibody of the desired
specificity. Specificity testing can be performed on culture
supernatants by a number of techniques, such as using the
immunizing polypeptide as the detecting reagent in a
standard immunoassay, or using cells expressing the polypeptide
in immunohistochemistry. A supply of monoclonal antibody
from the selected clones can be purified from a large volume
of tissue culture supernatant, or from the ascites fluid of
suitably prepared host animals injected with the clone.

Effective variations of this method include those in which
the immunization with the polypeptide is performed on
isolated cells. Antibody fragments and other derivatives can
be prepared by methods of standard protein chemistry, such
as subjecting the antibody to cleavage with a proteolytic
enzyme. Genetically engineered variants of the antibody can
be produced by obtaining a polynucleotide encoding the
antibody, and applying the general methods of molecular
biology to introduce mutations and translate the variant.

Monoclonal antibodies raised by injecting an intact
Glycoprotein B or a fragment comprising conserved sequences
may be cross-reactive between herpes viruses. Antibodies
that are virus or subfamily specific may be raised by
immunizing with a suitably specific antigen, as may be
selected from Table 8. Alternatively, virus-specific clones
may be selected from the cloned hybridomas by using a
suitable antigen, such as one selected from Class III of Table
8, in the screening process.

Specific antibodies against herpes virus Glycoprotein B
have a number of uses in developmental, diagnostic and
therapeutic work. For example, antibodies can be used in
drug screening (see U.S. Pat. No. 5,120,639). They may also
be used as a component of a passive vaccine, or for detecting
herpes virus in a biological sample and for drug targeting, as
described in the following sections.

Anti-idiotypes relating to Glycoprotein B may also be
prepared. This is accomplished by first preparing a
Glycoprotein B antibody, usually a monoclonal antibody,
according to the aforementioned methodology. The antibody is
then used as an immunogen in a volunteer or experimental
animal to raise an anti-idiotype. The anti-idiotype may be
either monoclonal or polyclonal, and its development is
generally according to the methodology used for the first
antibody. Selection of the anti-idiotype or hybridoma clones
expressing anti-idiotype is done using the immunogen
antibody as a positive selector, and using antibodies of unrelated
specificity as negative selectors. Usually, the negative
selector antibodies will be a polyclonal immunoglobulin
preparation or a pool comprising monoclonal immunoglobulins of
the same immunoglobulin class and subclass, and the same
species as the immunogen antibody. An anti-idiotype may be
used as an alternative component of an active vaccine
against Glycoprotein B.

Use of antibodies for detecting Glycoprotein B in biological 
samples 

Antibodies specific for Glycoprotein B can be used to
detect Glycoprotein B polypeptides and fragments of viral
origin that may be present, for example, in solid tissue
samples and cultured cells. Immunohistological techniques
to carry out such determinations will be obvious to a
practitioner of ordinary skill. Generally, the tissue is
preserved by a combination of techniques which may include
freezing, exchanging into different solvents, fixing with
agents such as paraformaldehyde, drying with agents such as
alcohol, or embedding in a commercially available medium
such as paraffin or OCT. A section of the sample is suitably
prepared and overlaid with a primary antibody specific for
the protein.

The primary antibody may be provided directly with a
suitable label. More frequently, the primary antibody is
detected using one of a number of developing reagents
which are easily produced or available commercially.
Typically, these developing reagents are
anti-immunoglobulin or protein A, and they typically bear labels
which include, but are not limited to: fluorescent markers
such as fluorescein, enzymes such as peroxidase that are
capable of precipitating a suitable chemical compound,
electron dense markers such as colloidal gold, or
radioisotopes such as ¹²⁵I. The section is then visualized using an
appropriate microscopic technique, and the level of labeling
is compared between the suspected virally infected cell and a
control cell, such as cells surrounding the area of infection
or taken from a remote site.

Proteins encoded by a Glycoprotein B gene can also be 
detected in a standard quantitative immunoassay. If the 
protein is secreted or shed from infected cells in any
appreciable amount, it may be detectable in plasma or serum
samples. Alternatively, the target protein may be solubilized 
or extracted from a solid tissue sample. Before quantitating, 
the protein may optionally be affixed to a solid phase, such 
as by a blot technique or using a capture antibody. 

A number of immunoassay methods are established in the 
art for performing the quantitation. For example, the protein 
may be mixed with a pre-determined non-limiting amount of 
the reagent antibody specific for the protein. The reagent 
antibody may contain a directly attached label, such as an 
enzyme or a radioisotope, or a second labeled reagent may 
be added, such as anti-immunoglobulin or protein A. For a 
solid-phase assay, unreacted reagents are removed by
washing. For a liquid-phase assay, unreacted reagents are
removed by some other separation technique, such as filtra- 
tion or chromatography. The amount of label captured in the 
complex is positively related to the amount of target protein 
present in the test sample. A variation of this technique is a 
competitive assay, in which the target protein competes with 
a labeled analog for binding sites on the specific antibody. In 
this case, the amount of label captured is negatively related 
to the amount of target protein present in a test sample. 
Results obtained using any such assay are compared 
between test samples, and control samples from an unin- 
fected source. 
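
Where the label reading is to be converted to an amount of target protein, one common approach is to interpolate against standards of known concentration. The disclosure does not prescribe a particular method; the piecewise-linear interpolation below, the function name, and the standard values are all assumptions for illustration. Because the standards are ordered by signal, the same function serves both the direct format (signal rises with concentration) and the competitive format (signal falls).

```python
def interpolate_concentration(signal, standards):
    """Estimate concentration from a signal by linear interpolation.

    standards is a list of (concentration, signal) pairs measured with known
    amounts of target protein.
    """
    points = sorted(standards, key=lambda p: p[1])  # order by signal
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return c0
            return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)
    raise ValueError("signal outside the range of the standards")


# Invented standards (concentration, signal), for illustration only.
direct_standards = [(0, 0.0), (10, 0.5), (20, 1.0)]
competitive_standards = [(0, 1.0), (10, 0.5), (20, 0.0)]

print(interpolate_concentration(0.75, direct_standards))
print(interpolate_concentration(0.75, competitive_standards))
```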

Use of antibodies for drug targeting 

An example of how antibodies can be used in therapy of 
herpes virus infection is in the specific targeting of effector 
components. Virally infected cells generally display pep- 
tides of the virus, especially proteins expressed on the 
outside of the viral envelope. The peptide therefore provides 
a marker for infected cells that a specific antibody can bind 
to. An effector component attached to the antibody therefore 
becomes concentrated near the infected cells, improving the 
effect on those cells and decreasing the effect on uninfected 
cells. Furthermore, if the antibody is able to induce 
endocytosis, this will enhance entry of the effector into the 
cell interior. 

For the purpose of targeting, an antibody specific for the 
viral polypeptide (in this case, a region of a Glycoprotein B) 
is conjugated with a suitable effector component, preferably 
by a covalent or high-affinity bond. Suitable effector
components in such compositions include radionuclides such as
¹³¹I, toxic chemicals, and toxic peptides such as diphtheria
toxin. Another suitable effector component is an antisense 
polynucleotide, optionally encapsulated in a liposome. 
Diagnostic kits

Diagnostic procedures using the polynucleotides,
oligonucleotides, peptides, or antibodies of this invention
may be performed by diagnostic laboratories, experimental
laboratories, practitioners, or private individuals. This
invention provides diagnostic kits which can be used in
these settings. The presence of a herpes virus in the
individual may be manifest in a clinical sample obtained from
that individual as an alteration in the DNA, RNA, protein, or
antibodies contained in the sample. An alteration in one of
these components resulting from the presence of a herpes
virus may take the form of an increase or decrease of the
level of the component, or an alteration in the form of the
component, compared with that in a sample from a healthy
individual. The clinical sample is optionally pre-treated for
enrichment of the target being tested for. The user then
applies a reagent contained in the kit in order to detect the
changed level or alteration in the diagnostic component.

Each kit necessarily comprises the reagent which renders
the procedure specific: a reagent polynucleotide, used for
detecting target DNA or RNA; a reagent antibody, used for
detecting target protein; or a reagent polypeptide, used for
detecting target antibody that may be present in a sample to
be analyzed. The reagent is supplied in a solid form or liquid
buffer that is suitable for inventory storage, and later for
exchange or addition into the reaction medium when the test
is performed. Suitable packaging is provided. The kit may
optionally provide additional components that are useful in
the procedure. These optional components include buffers,
capture reagents, developing reagents, labels, reacting
surfaces, means for detection, control samples, instructions,
and interpretive information.
Other members of the RFHV/KSHV subfamily 

RFHV and KSHV are exemplary members of the
RFHV/KSHV subfamily. This invention embodies polynucleotide
sequences encoding Glycoprotein B of other members of the
subfamily, as defined herein. The consensus-degenerate
gamma herpes virus oligonucleotide Type 1 and 2 primers,
and the methods described herein are designed to be suitable
for characterization of the corresponding polynucleotide
fragment of other members of the RFHV/KSHV subfamily.
One such member is another virus infecting monkeys,
designated RFHV2. A segment of the Glycoprotein B
encoding sequence for this virus was cloned from RF tissue
obtained from a Macaca mulatta monkey, as described in
Example 12.

In order to identify and characterize other members of the 
family, reagents and methods of this invention are applied to 
DNA extracted from tissue samples suspected of being 
infected with such a virus.

Suitable sources of DNA for this purpose include biologi- 
cal samples obtained from a wide range of conditions 
occurring in humans and other vertebrates. Preferred are 
conditions in which the agent is suspected of being 
lymphotropic, similar to other members of the gamma
herpes virus subfamily; for example, infectious mono- 
nucleosis of non-EBV origin. More preferred are conditions 
which resemble in at least one of their clinical or histological 
features the conditions with which RFHV or KSHV are 
associated. These include: a) conditions in which fibropro-
liferation is part of the pathology of the disease, especially
in association with collagen deposition, and especially 
where the fibrous tissue is disorganized; b) conditions 
involving vascular dysplasia; c) conditions involving malig-
nant transformation, especially but not limited to cells of
lymphocyte lineage; d) conditions for which an underlying 
immunodeficiency contributes to the frequency or severity
of the disease; e) conditions which arise idiopathically at
multiple sites in an organ or in the body as a whole; f)
conditions which epidemiological data suggest are associ-
ated with an infectious or environmental agent. Conditions
which fulfill more than one of these criteria are comparably 
more preferred. Some examples of especially preferred 
conditions include retroperitoneal fibrosis, nodular 
fibromatosis, pseudosarcomatous fibromatosis,
fibrosarcomas, sclerosing mesenteritis, acute respiratory dis- 
ease syndrome, idiopathic pulmonary fibrosis, diffuse pro- 
liferative glomerulonephritis of various types, gliomas, 
glioblastomas, gliosis, and all types of leukemias and lym- 
phomas. 

The type of tissue sample used will depend on the clinical 
manifestations of the condition. Samples more likely to 
contain a virus associated with the condition may be taken 
from the site involved in the disease pathology, or to which 
there is some other evidence of viral tropism. Peripheral 
blood mononuclear cells of an infected individual may also 
act as a carrier of an RFHV/KSHV subfamily virus. KSHV 
has been detected in PBMC of both Kaposi's Sarcoma 
(Moore et al. 1995b) and Castleman's disease (Dupin et al.). 
Other suitable sources are cell cultures developed from such 
sources, and enriched or isolated preparations of virus 
obtained from such sources. For negative control samples, 
tissue may be obtained from apparently unaffected sites on 
the same individuals, or from matched individuals who 
apparently do not suffer from the condition. 

The process of identification of members of the RFHV/ 
KSHV subfamily preferably involves the use of the methods 
and reagents provided in this invention, either singly or
in combination. 

One method involves amplifying a polynucleotide encod- 
ing a herpes virus Glycoprotein B from DNA extracted from 
the sample. This can be performed, for example, by ampli- 
fying the polynucleotide in a reaction such as a PCR. In one 
variation, the amplification reaction is primed using broadly 
specific consensus-degenerate Type 1 oligonucleotides, such 
as those shown in Table 4. This will amplify herpes viruses, 
primarily of the gamma type. Since the RFHV/KSHV sub- 
family is a subset of gamma herpes viruses, Glycoprotein B 
sequences detected by this variation need to be characterized 
further to determine whether they fall into the RFHV/KSHV 
subfamily. In a second variation, the amplification is primed 
with RFHV or KSHV specific Type 3 oligonucleotides, such 
as those listed in Table 7, or other Glycoprotein B poly- 
nucleotide segments taken from these viruses. The amplifi- 
cation is conducted under conditions of medium to low 
stringency, so that the oligonucleotides will cross-hybridize 
with related species of viruses. In a more preferred variation, 
the amplification reaction is primed using RFHV/KSHV 
subfamily specific Type 2 oligonucleotides, such as those 
listed in Table 6. Under appropriate hybridization 
conditions, these primers will preferentially amplify Glyco- 
protein B from herpes viruses in the subfamily. 

Preferred members of the subfamily detected using a 
Glycoprotein B polynucleotide probe are those that are at 
least 65% identical with the RFHV or KSHV Glycoprotein 
B nucleotide sequence between residues 36 and 354 of SEQ.
ID NO:1 or SEQ. ID NO:3. More preferred are those that are
at least about 67% identical; more preferred are those at least 
about 70% identical; more preferred are those that are at 
least about 80% identical; even more preferred are those 
about 90% identical or more. 
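
By way of illustration only, the identity tiers recited above can be sketched as a short calculation. The sketch assumes that the two fragments have already been aligned to equal length (for example, over the region corresponding to residues 36 to 354), and that gapped columns are simply skipped; both are assumptions of the sketch and not a statement of how identity is computed elsewhere in this disclosure. The function names and the toy sequences are hypothetical.

```python
# Illustrative sketch only: function names, gap handling, and the toy
# sequences are assumptions, not part of the disclosure.

# Identity thresholds (percent) recited in the text, most stringent first.
PREFERENCE_TIERS = (90.0, 80.0, 70.0, 67.0, 65.0)


def percent_identity(aligned_a, aligned_b):
    """Percent of aligned, non-gap columns that carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def highest_tier_met(identity):
    """Return the most stringent recited threshold met, or None."""
    for tier in PREFERENCE_TIERS:
        if identity >= tier:
            return tier
    return None


if __name__ == "__main__":
    toy_reference = "ACGTACGTAC"  # hypothetical, not taken from SEQ. ID NO:1
    toy_candidate = "ACGTACGTTT"  # hypothetical
    identity = percent_identity(toy_reference, toy_candidate)
    print(identity, highest_tier_met(identity))
```

A candidate that meets none of the recited thresholds returns None, and would not fall within the preferred members on this criterion alone.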

Members of the subfamily can also be identified by 
performing a hybridization assay on the polynucleotide of 
the sample, using a suitable probe. The polynucleotide to be
tested may optionally be amplified before conducting the
hybridization assay, such as by using Type 1 or Type 2 
oligonucleotides as primers. The target is then tested in a 
hybridization reaction with a suitable labeled probe. The 
probe preferably comprises at least 21 nucleotides, prefer-
ably at least about 25 nucleotides, more preferably at least
about 50 nucleotides contained in the RFHV or KSHV Gly-
coprotein B sequence in SEQ. ID NOS:1 and 3. Even more
preferably, the probe comprises nucleotides 36-354 of SEQ.
ID NOS:1 or 3. Other preferred probes include Type 2
oligonucleotides, such as those shown in Table 6. Hybrid- 
ization conditions are selected to permit the probe to hybrid- 
ize with Glycoprotein B polynucleotide sequences from the 
RFHV/KSHV subfamily, but not previously sequenced her-
pes viruses; particularly sHV1, bHV4, eHV2, mHV68,
hEBV, hCMV, hHV6, hVZV, and HSV1. Formation of a 
stable duplex with the test polynucleotide under these con- 
ditions suggests the presence of a polynucleotide in the 
sample derived from a member of the RFHV/KSHV sub-
family.

Members of the subfamily can also be identified by using 
a Class II antibody, the preparation of which was outlined 
earlier. A Class II antibody cross-reacts between antigens 
produced by members of the RFHV/KSHV subfamily, but 
not with other antigens, including those produced by herpes
viruses that are not members of the subfamily. The test for new
subfamily members is performed, for example, by using the
antibodies in an immunohistochemistry study of tissue sec- 
tions prepared from individuals with the conditions listed 
above. Positive staining of a tissue section with the antibody
suggests the presence of Glycoprotein B in the sample from 
a member of the RFHV/KSHV subfamily, probably because 
the tissue is infected with the virus. If, in addition, the tissue 
section is non-reactive with RFHV and KSHV specific Class 
III antibodies, the Glycoprotein B in the tissue may be
derived from another member of the subfamily. Similarly, if 
Class II antibodies are found in the circulation of an 
individual, the individual may have been subject to a present 
or past infection with a member of the RFHV/KSHV sub-
family.

Once a putative new virus is identified by any of the 
aforementioned methods, its membership in the RFHV/ 
KSHV subfamily may be confirmed by obtaining and 
sequencing a region of the Glycoprotein B gene of the virus, 
and comparing it with that of RFHV or KSHV according to
the subfamily definition. For new members of the RFHV/ 
KSHV subfamily, other embodiments of this invention may 
be brought into play for purposes of detection, diagnosis, 
and pharmaceutical development. Adaptation of the embodi-
ments of the invention for a new subfamily member, if
required, is expected to be minor in nature, and will be 
obvious based on the new sequence data, or a matter of 
routine adjustment. 

Altered forms of Glycoprotein B from the RFHV/KSHV 
subfamily

This invention also embodies altered forms of Glycopro- 
tein B of the RFHV/KSHV subfamily. 

A number of studies on both naturally occurring and 
induced mutations of the Glycoprotein B of HSV1 and 
hCMV point to a role of certain regions of the molecule for
its various biochemical functions. See, for example,
Reschke et al. and Baghian et al. for a role of carboxy- 
terminal amino acids in fusion; Shiu et al. and Pellett et al. 
for epitopes for neutralizing antibodies; Gage et al. for 
regions of the molecule involved in syncytium formation;
Navarro et al. (1992) for regions involved in virus penetra-
tion and cell-to-cell spread; Quadri et al. and Navarro et al.
(1991) for regions involved in intracellular transport of
Glycoprotein B during biosynthesis. 

Some of the residues described may be conserved 
between the Glycoprotein B molecules of the viruses inves- 
tigated previously, and the Glycoprotein B molecules 
described herein. By analogy, mutation of the same residue 
in the Glycoprotein B of the RFHV/KSHV subfamily is 
expected to have a similar effect as described for other 
viruses. Alternatively, functional regions of different Gly- 
coprotein B molecules may be combined to produce Gly- 
coprotein B recombinants with altered function. For 
example, replacing the Glycoprotein B gene in a pathogenic 
virus with that of a non-pathogenic virus may reduce the
pathogenicity of the recombinant (Kostal et al.). Either
mutation or recombination of Glycoprotein B of the
RFHV/KSHV herpes virus subfamily may lead to attenuated 
strains, in which either the infectivity, replication activity, or 
pathogenicity is reduced. Alterations in the Glycoprotein B 
sequence which have these effects are contemplated in this 
invention. 

Attenuated strains of herpes viruses are useful, for 
example, in developing polyvalent vaccines. It is desirable, 
especially in developing countries, to provide prophylactic 
vaccines capable of stimulating the immune system against 
several potential pathogens simultaneously. Viruses that are 
engineered to express immunogenic peptides of several 
different pathogens may accomplish this purpose. Herpes 
viruses may be especially suitable vectors, because the large 
genome may easily accommodate several kilobases of extra 
DNA encoding the peptides. Ideally, the viral vector is 
sufficiently intact to exhibit some biological activity and 
attract the attention of the host's immune system, while at 
the same time being sufficiently attenuated not to cause 
significant pathology. Thus, an attenuated virus of the 
RFHV/KSHV subfamily may be useful as a vaccine against 
like virulent forms, and may be modified to express addi- 
tional peptides and extend the range of immune protection. 

Another use for attenuated forms of herpes viruses is as 
delivery vehicles for gene therapy (Latchman et al., Glorioso 
et al.). In order to be effective, polynucleotides in gene 
therapy must be delivered to the target tissue site. In the 
treatment of fibrotic diseases, malignancies and related 
conditions, attenuated viral vectors of the RFHV/KSHV 
subfamily may be preferable over other targeting 
mechanisms, including other herpes viruses, since they have 
the means by which to target towards the affected tissues. In 
this embodiment, the virus is first attenuated, and then 
modified to contain the polynucleotide that is desired for 
gene therapy, such as those that are outlined in a previous 
section. 

Glycoprotein B in RFHV/KSHV subfamily vaccines 

Because of its prominence on the envelope of the infec-
tious virus and infected cells, Glycoprotein B is predicted to
be a useful target for immune effectors. Herpes virus Gly- 
coprotein B is generally immunogenic, giving rise to anti- 
bodies capable of neutralizing the virus and preventing it 
from entering a replicative phase. In addition, Glycoprotein 
B is capable of eliciting a T-cell response, which may help 
eradicate an ongoing viral infection by attacking sites of 
viral replication in host cells. 

This invention embodies vaccine compositions and meth- 
ods for using them in the prevention and management of 
infection by viruses from the RFHV/KSHV subfamily. 

One series of embodiments relates to active vaccines.
These compositions are designed to stimulate an immune 
response in the individual being treated against Glycoprotein 
B. They generally comprise either the Glycoprotein B
molecule, an immunogenic fragment or variant thereof, or a
cell or particle capable of expressing the Glycoprotein B 
molecule. Alternatively, they may comprise a polynucle- 
otide encoding an immunogenic Glycoprotein B fragment 
(Horn et al.), preferably in the form of an expression vector.
Polynucleotide vaccines may optionally comprise a delivery 
vehicle like a liposome or viral vector particle, or may be 
administered as naked DNA. 

Vaccine compositions of this invention are designed in 
such a way that the immunogenic fragment is presented to 
stimulate the proliferation and/or biological function of the 
appropriate immune cell type. Compositions directed at 
eliciting an antibody response comprise or encode B cell 
epitopes, and may also comprise or encode other elements 
that enhance uptake and display by antigen-presenting
cells, or that recruit T cell help. Compositions directed at
eliciting helper T cells, especially CD4+ cells, generally
comprise T cell epitopes that can be presented in the context 
of class II histocompatibility molecules. Compositions 
directed at stimulating cytotoxic T cells and their precursors, 
especially CD8+ cells, generally comprise T cell epitopes
that can be presented in the context of class I histocompat- 
ibility molecules. 

In the protection of an individual against a future expo- 
sure with herpes virus, an antibody response may be suffi-
cient. Prophylactic compositions preferably comprise com-
ponents that elicit a B cell response. Successful eradication
of an ongoing herpes virus infection may involve the par- 
ticipation of cytotoxic T cells, T helper-inducer cells, or 
both. Compositions for treating ongoing infection preferably
comprise components capable of eliciting both T helper cells
and cytotoxic T cells. Compositions that preferentially
stimulate Type 1 helper (TH1) cells over Type 2 helper (TH2)
cells are even more preferred. The preparation and testing of 
suitable compositions for active vaccines is outlined in the 
sections that follow. 

Another series of embodiments relates to passive vaccines
and other materials for adoptive transfer. These composi- 
tions generally comprise specific immune components 
against Glycoprotein B that are immediately ready to par- 
ticipate in viral neutralization or eradication. Therapeutic 
methods using these compositions are preferred to prevent
pathologic consequences of a recent viral exposure. They are 
also preferred in immunocompromised individuals inca-
pable of mounting a sufficient immune response to an active
vaccine. Such individuals include those with congenital 
immunodeficiencies, acquired immunodeficiencies (such as
those infected with HIV or on kidney dialysis), and those on 
immunosuppressive therapies, for example, with corticos- 
teroids. 

Suitable materials for adoptive transfer include specific 
antibody against Glycoprotein B, as described below. Also
included is the adoptive transfer of immune cells. For
example, T cells reactive against Glycoprotein B may be 
taken from a donor individual, optionally cloned or cultured 
in vitro, and then transferred to a histocompatible recipient. 
More preferably, the transferred cells are autologous to the
recipient, and stimulated in vitro. Thus, T cells are purified 
from the individual to be treated, cultured in the presence of 
immunogenic components of Glycoprotein B and suitable 
stimulatory factors to elicit virus-specific cells, and then 
readministered.

Certain compositions embodied herein may have proper- 
ties of both active and passive vaccines. For example, 
Glycoprotein B antibody given by adoptive transfer may 
confer immediate protection against herpes virus, and may 
also stimulate an ongoing response, through an anti-idiotype
network, or by enhancing the immune presentation of viral 
antigen. 



Vaccines comprising Glycoprotein B polypeptides 

Specific components of vaccines to stimulate an immune 
response against Glycoprotein B include the intact Glyco- 
protein B molecule, and fragments of Glycoprotein B that 
are immunogenic in the host. 

Intact Glycoprotein B and longer fragments thereof may 
be prepared by any of the methods described earlier, espe- 
cially purification from a suitable expression vector com- 
prising a Glycoprotein B encoding polynucleotide. Isolated 
Glycoprotein B from other viral strains stimulates a protec-
tive immune response (See U.S. Pat. No. 5,171,568: Burke
et al.). Preferred fragments comprise regions of the molecule 
exposed on the outside of the intact viral envelope; located 
within about 650 amino acids of the N-terminal of the 
mature protein. 

Glycosylation of Glycoprotein B is not required for
immunogenicity (O'Donnell et al.). Hence, glycosylated
and unglycosylated forms of the molecule are equally pre- 
ferred. Glycosylation may be determined by standard tech- 
niques; for example, comparing the mobility of the protein 
in SDS polyacrylamide gel electrophoresis before and after 
treating with commercially available endoglycosidase type 
F or H. 

Smaller fragments of 5—50 amino acids comprising par- 
ticular epitopes of Glycoprotein B are also suitable vaccine 
components. These may be prepared by any of the methods 
described earlier; most conveniently, by chemical synthesis. 
Preferred fragments are those which are immunogenic and 
expressed on the outside of the viral envelope. Even more 
preferred are fragments implicated in a biological function 
of Glycoprotein B, such as binding to cell surface receptors 
or penetration of the virus into a target cell. 

Immunogenicity of various epitopes may be predicted by 
algorithms known in the art. Antigenic regions for B cell 
receptors may be determined, for example, by identifying 
regions of variable polarity (Hopp et al., see Example 9). 
Antigenic regions for T cell receptors may be determined, 
for example, by identifying regions capable of forming an 
amphipathic helix in the presentation groove of a histocom- 
patibility molecule. Antigenic regions may also be identified 
by analogy with Glycoprotein B molecules of other viral 
species. See, e.g., Sanchez-Pescador et al. and Mester et al., 
for B cell epitopes of HSV1; Liu et al. for HLA-restricted 
helper T cell epitopes of hCMV; and Hanke et al. for 
cytotoxic T lymphocyte epitopes of HSV1. 

Immunogenicity of various epitopes may be measured 
experimentally by a number of different techniques. 
Generally, these involve preparing protein fragments of 
5—20 amino acids in length comprising potential antigenic 
regions, and testing them in a specific bioassay. Fragments
may be prepared by CNBr and/or proteolytic degradation of 
a larger segment of Glycoprotein B, and purified, for 
example, by gel electrophoresis and blotting onto nitrocel- 
lulose (Demotz et al.). Fragments may also be prepared by 
standard peptide synthesis (Schumacher et al., Liu et al.). In 
a preferred method, consecutive peptides of 12 amino acids 
overlapping by 8 residues are synthesized according to the 
entire extracellular domain of the mature Glycoprotein B 
molecule, using F-Moc chemistry on a nylon membrane 
support (see Example 11). 
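
The overlapping-peptide layout described above (12-residue peptides, consecutive peptides sharing 8 residues) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the input is any one-letter amino acid string supplied by the user, and the extra end-anchored peptide used to cover a short tail is a convenience of the sketch, not a step recited in the text.

```python
# Illustrative sketch only: the function name and the handling of a
# leftover tail are assumptions, not part of the disclosure.

def overlapping_peptides(sequence, length=12, overlap=8):
    """Split a sequence into fixed-length peptides that overlap."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be >= 0 and smaller than length")
    if len(sequence) < length:
        return []  # too short to yield a single full-length peptide
    step = length - overlap  # 4 residues for 12-mers overlapping by 8
    last_start = len(sequence) - length
    peptides = [sequence[i:i + length] for i in range(0, last_start + 1, step)]
    # Assumption: if the regular windows stop short of the C-terminal
    # end, add one peptide anchored at the end so no residue is missed.
    if last_start % step != 0:
        peptides.append(sequence[-length:])
    return peptides


if __name__ == "__main__":
    toy_domain = "ABCDEFGHIJKLMNOPQRST"  # hypothetical 20-residue stand-in
    for peptide in overlapping_peptides(toy_domain):
        print(peptide)
```

With the default settings each peptide begins 4 residues after the previous one, so every internal residue appears in up to three peptides.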

Reactivity against the prepared fragment can then be 
determined in samples from individuals exposed to the intact 
virus or a Glycoprotein B component. The individual may 
have been experimentally exposed to the Glycoprotein B 
component by deliberate administration. Alternatively, the 
individual may have a naturally occurring viral infection, 
preferably confirmed by a positive amplification reaction
using a virus-specific oligonucleotide probe to Glycoprotein
B or DNA Polymerase. Blood samples are obtained from the 
individual, and used to prepare serum, T cells, and peripheral 
blood mononuclear cells (PBMC) by standard techniques. 

Serum may be tested for the presence of Glycoprotein B
specific antibody in an enzyme-linked immunosorbent
assay. For example, peptides attached to a solid support such
as a nylon membrane are incubated with the serum, washed, 
incubated with an enzyme-linked anti-immunoglobulin, and
developed using an enzyme substrate. The presence of
antibody against a particular Glycoprotein B peptide is
indicated by a higher level of reaction product in the test
well than in a well containing an unrelated peptide (Example
11).

Lymphocyte preparations may be tested for the presence
of Glycoprotein B specific helper T cells in a proliferation
assay. Approximately 2×10⁴ helper T cells are incubated
with the peptide at 10⁻⁴ to 10⁻⁶ M in the presence of irradiated
autologous or irradiated 10⁵ PBMC as antigen presenting
cells for about 3 days. [³H]Thymidine is added for about the
last 16 h of culture. The cells are then harvested and washed. 
Radioactivity in the washed cells at a level of about 10 fold 
over those cultured in the absence of peptide reflects pro- 
liferation of T cells specific for the peptide (Liu et al.). If 
desired, cells with a CD3⁺4⁺8⁻ phenotype may be cloned for
further characterization of the helper T cell response. 
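
The readout of the proliferation assay can be illustrated with a short calculation of the ratio of incorporated radioactivity with and without peptide. The sketch is offered under stated assumptions: the function names, the use of counts per minute as the unit, and the example counts are hypothetical; only the approximately 10 fold criterion is taken from the text.

```python
# Illustrative sketch only: names, units, and example counts are
# assumptions; the ~10-fold criterion is the one recited in the text.

FOLD_THRESHOLD = 10.0


def stimulation_index(cpm_with_peptide, cpm_without_peptide):
    """Fold increase in incorporated radioactivity over the no-peptide culture."""
    if cpm_without_peptide <= 0:
        raise ValueError("background counts must be positive")
    return cpm_with_peptide / cpm_without_peptide


def indicates_proliferation(cpm_with_peptide, cpm_without_peptide):
    """True when the fold increase reaches the recited threshold."""
    return stimulation_index(cpm_with_peptide, cpm_without_peptide) >= FOLD_THRESHOLD


if __name__ == "__main__":
    # Hypothetical counts for one peptide and its matched control well.
    print(stimulation_index(12000.0, 1000.0))
    print(indicates_proliferation(12000.0, 1000.0))
```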

Lymphocyte preparations may be tested for the presence 
of Glycoprotein B specific cytotoxic T cells in a ⁵¹Cr release
assay. Targets are prepared by infecting allogeneic cells with
a herpes virus comprising an expressible Glycoprotein B
gene. Alternatively, allogeneic cells transfected with a Gly-
coprotein B expression vector may be used. The targets are
incubated with ⁵¹Cr for about 90 min at 37° C. and then
washed. About 5×10⁴ target cells are incubated with
10⁻⁴ to 10⁻⁵ M of the peptide and 0.1-2×10⁴ test T cells for
about 30 min at 37° C. Radioactivity released into the
supernatant at a level substantially above that due to spon-
taneous lysis reflects CTL activity. If desired, cells with a
CD3⁺4⁻8⁺ phenotype may be cloned for further character-
ization of the CTL response.
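
The comparison of released radioactivity with spontaneous lysis is commonly expressed as percent specific lysis. The sketch below uses the conventional formula, which relies on a maximum-release control (for example, detergent-lysed targets) that is not described in the text above and is an assumption here, as are the function name and the example counts.

```python
# Illustrative sketch only: the maximum-release control, the function
# name, and the example counts are assumptions, not part of the text.

def percent_specific_lysis(experimental, spontaneous, maximum):
    """Conventional chromium-release calculation.

    experimental: counts released from targets incubated with test T cells
    spontaneous:  counts released from targets incubated alone
    maximum:      counts released from fully lysed targets (assumed control)
    """
    if maximum <= spontaneous:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)


if __name__ == "__main__":
    # Hypothetical counts per minute for a single well and its controls.
    print(percent_specific_lysis(experimental=1500.0,
                                 spontaneous=500.0,
                                 maximum=2500.0))
```

A value near zero indicates release no greater than spontaneous lysis; what level counts as substantially above background is left to the practitioner.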

Glycoprotein B peptides may optionally be combined in 
a vaccine with other peptides of the same virus. Suitable 
peptides include peptides of any of the other components of 
the herpes virus, such as Glycoproteins C, D, H, E, I, J, and 
G. Glycoprotein B peptides may also optionally be com-
bined with immunogenic peptides from different viruses to
provide a multivalent vaccine against more than one patho- 
genic organism. Peptides may be combined by preparing a 
mixture of the peptides in solution, or by synthesizing a 
fusion protein in which the various peptide components are
linked. 

Forms of Glycoprotein B comprising suitable epitopes 
may optionally be treated chemically to enhance their 
immunogenicity, especially if they comprise 100 amino 
acids or less. Such treatment may include cross-linking, for
example, with glutaraldehyde; or linking to a protein carrier,
such as keyhole limpet hemocyanin (KLH) or tetanus tox- 
oid. 

The peptide or peptide mixture may be used neat, but 
normally will be combined with a physiologically and
pharmacologically acceptable excipient, such as water, 
saline, physiologically buffered saline, or sugar solution. 

In a preferred embodiment, an active vaccine also com- 
prises an adjuvant which enhances presentation of the 
immunogen or otherwise accentuates the immune response
against the immunogen. Suitable adjuvants include alum, 
aluminum hydroxide, beta-2 microglobulin (WO 91/16924:
Rock et al.), muramyl dipeptides, muramyl tripeptides (U.S.
Pat. No. 5,171,568: Burke et al.), and monophosphoryl lipid
A (U.S. Pat. No. 4,436,728: Ribi et al.; and WO 92/16231:
Francotte et al.). Immunomodulators such as Interleukin 2
may also be present. The peptide and other components (if 
present) are optionally encapsulated in a liposome or micro- 
sphere. For an outline of the experimental testing of various 
adjuvants, see U.S. Pat. No. 5,171,568 (Burke et al.). A 
variety of adjuvants may be efficacious. The choice of an 
adjuvant will depend at least in part on the stability of the 
vaccine in the presence of the adjuvant, the route of 
administration, and the regulatory acceptability of the 
adjuvant, particularly when intended for human use. 

Polypeptide vaccines generally have a broad range of 
effective latitude. The usual route of administration is 
intramuscular, but preparations may also be developed 
which are effective given by other routes, including 
intravenous, intraperitoneal, oral, intranasal, and by inhala- 
tion. The total amount of Glycoprotein B polypeptide per 
dose of vaccine when given intramuscularly will generally 
be about 10 µg to 5 mg; usually about 50 µg to 2 mg; and
more usually about 100 to 500 µg. The vaccine is preferably
administered first as a priming dose, and then again as a 
boosting dose, usually at least four weeks later. Further 
boosting doses may be given to enhance or rejuvenate the 
response on a periodic basis. 

Vaccines comprising viral particles expressing Glycoprotein 
B 

Active vaccines may also be prepared as particles that 
express an immunogenic epitope of Glycoprotein B. 

One such vaccine comprises the L-particle of a recombi- 
nant herpes virus (see U.S. Pat. No. 5,284,122: Cunningham 
et al.). The genome of the recombinant virus is defective in 
a capsid component, or otherwise prevented from forming 
intact virus; however, it retains the ability to make 
L-particles. The genome is engineered to include a Glyco-
protein B encoding polynucleotide of the present invention
operatively linked to the controlling elements of the recom- 
binant virus. The virus is then grown, for example, in 
cultured cells, and the particles are purified by centrifugation 
on a suitable gradient, such as FICOLL™. Such prepara- 
tions are free of infective virus, and capable of expressing 
peptide components of a number of different desirable 
epitopes. 

Another such vaccine comprises a live virus that 
expresses Glycoprotein B of the present invention as a 
heterologous antigen. Such viruses include HIV, SIV, FIV,
equine infectious anemia virus, visna virus, and herpes viruses of
other species. The virus should be naturally non-pathogenic 
in the species to be treated; or alternatively, it should be 
attenuated by genetic modification, for example, to reduce 
replication or virulence. Herpes virus may be attenuated by 
mutation of a gene involved in replication, such as the DNA 
Polymerase gene. Herpes virus may also be attenuated by 
deletion of an essential late-stage component, such as Gly-
coprotein H (WO 92/05263: Inglis et al.). A live vaccine may
be capable of a low level of replication in the host, particu- 
larly if this enhances protein expression, but not to the extent 
that it causes any pathological manifestation in the subject 
being treated. 

A preferred viral species for preparing a live vaccine is 
adenovirus. For human therapy, human adenovirus types 4 
and 7 have been shown to have no adverse effects, and are
suitable for use as vectors. Accordingly, a Glycoprotein B 
polynucleotide of the present invention may be engineered, 
for example, into the E1 or E3 region of the viral genome.
It is known that adenovirus vectors expressing Glycoprotein
B from HSV1 or HSV2 stimulate the production of high titer
virus-neutralizing antibody (McDermott et al.). The 
response protects experimental animals against a lethal 
challenge with the respective live virus. 

Also preferred as a virus for a live recombinant vaccine is
a recombinant pox virus, especially vaccinia. Even more 
preferred are strains of vaccinia virus which have been 
modified to inactivate a non-essential virulence factor, for 
example, by deletion or insertion of an open reading frame 
relating to the factor (U.S. Pat. No. 5,364,773: Paoletti et 
al.). To prepare the vaccine, a Glycoprotein B encoding 
polynucleotide of the present invention is genetically engi- 
neered into the viral genome and expressed under control of 
a vaccinia virus promoter. Recombinants of this type may be 
used directly for vaccination at about 10⁷ to 10⁸ plaque-
forming units per dose. Single doses may be sufficient to
stimulate an antibody response. Vaccinia virus recombinants 
comprising Glycoprotein B of HSV1 are effective in pro-
tecting mice against lethal HSV1 infection (Cantin et al.).

Another vaccine in this category is a self-assembling
replication-defective hybrid virus. See, for example, WO
92/05263 (Inglis et al.). The particle may contain, for 
example, capsid and envelope glycoproteins, but not an 
intact viral genome. As embodied in this invention, one of 
the glycoproteins in the viral envelope is Glycoprotein B. 

In a preferred embodiment, the particle is produced by a
viral vector of a first species, having a sufficient segment of 
the genome of that species to replicate, along with encoding 
regions for a capsid and an envelope from a heterologous 
species (see U.S. Pat. No. 5,420,026: Payne). Genetic ele-
ments of the first species are selected such that infection of
eukaryotic cells with the vector produces capsid and enve- 
lope glycoproteins that self-assemble into replication- 
defective particles. In a variant of this embodiment, poly- 
nucleotides encoding the capsid and envelope glycoproteins 
are provided in two separate vectors derived from the first
viral species. The capsid encoding regions may be derived 
from a lentivirus, such as HIV, SIV, FIV, equine infectious 
anemia virus, or visna virus. The envelope encoding regions 
comprise a Glycoprotein B encoding polynucleotide of the 
present invention. Preferably, all envelope components are
encoded by a herpes virus, particularly of the RFHV/KSHV 
subfamily. The defective viral particles are obtained by 
infecting a susceptible eukaryotic cell line such as BSC-40 
with the vector(s) and harvesting the supernatant after about 
18 hours. Viral particles may be further purified, if desired,
by centrifugation through a sucrose cushion. Particles may 
also be treated with 0.8% formalin at 40° C. for 24 hours 
prior to administration as a vaccine. 

Vaccines comprising a live attenuated virus or virus 
analog may be lyophilized for refrigeration. Diluents may
optionally include tissue culture medium, sorbitol, gelatin,
sodium bicarbonate, albumin, saline solution, phos-
phate buffer, and sterile water. Other active components may
optionally be added, such as attenuated strains of measles, 
mumps, and rubella, to produce a polyvalent vaccine. The
suspension may be lyophilized, for example, by the gas 
injection technique. This is performed by placing vials of 
vaccine in a lyophilizing chamber precooled to about -45° 
C. with 10-18 Pa of dry sterile argon, raising the tempera-
ture about 5-25° C. per h to +30° C., conducting a second
lyophilizing cycle with full vacuum, and then sealing the
vials under argon in the usual fashion (see EP 0290197B1:
McAleer et al.). For vaccines comprising live herpes virus,
the final lyophilized preparation will preferably contain
2-8% moisture.

It is recognized that a number of alternative compositions 
for active vaccines, not limited to those described here in
detail, may be efficacious in eliciting specific B- and T-cell
immunity. All such compositions are embodied in the spirit 
of the present invention, providing they include a RFHV/ 
KSHV subfamily Glycoprotein B polynucleotide or 
polypeptide as an active ingredient.

Vaccines comprising Glycoprotein B antibodies

Antibody against Glycoprotein B of the RFHV/KSHV 
subfamily may be administered by adoptive transfer to 
immediately confer a level of humoral immunity in the 
treated subject. Passively administered anti-glycoprotein B 
experimentally protects against a lethal challenge with other 
herpes viruses, even in subjects with compromised T-cell 
immunity (Eis-Hubinger et al.). 

The antibody molecule used should be specific for the
Glycoprotein B against which protection is desired. It should not
be cross-reactive with other antigens, particularly endogenous
antigens of the subject to be treated. The antibody may be 
specific for the entire RFHV/KSHV subfamily (Class II 
antibodies), or for a particular virus species (Class III 
antibodies), depending on the objective of the treatment. 
Preferably, the antibody will have an overall affinity for a
polyvalent antigen of at least about 10^8 M^-1; more preferably
it will be at least about 10^10 M^-1; more preferably it will
be at least about 10^12 M^-1; even more preferably, it will be
10^13 M^-1 or more. Intact antibody molecules, recombinants,
fusion proteins, or antibody fragments may be used; 
however, intact antibody molecules or recombinants able to 
express natural antibody effector functions are preferred. 
Relevant effector functions include but are not limited to 
virus aggregation; antibody-dependent cellular cytotoxicity; 
complement activation; and opsonization. 

Antibody may be prepared according to the description 
provided in an earlier section. For systemic protection, the 
antibody is preferably monomeric, and preferably of the IgG 
class. For mucosal protection, the antibody may be 
polymeric, preferably of the IgA class. The antibody may be 
either monoclonal or polyclonal; typically, a cocktail of 
monoclonal antibodies is preferred. It is also preferred that 
the preparation be substantially pure of other biological 
components from the original antibody source. Other anti- 
body molecules of desired reactivity, and carriers or stabi- 
lizers may be added after purification. 

In some instances, it is desirable that the antibody 
resemble as closely as possible an antibody of the species 
which is to be treated. This is to prevent the administered 
antibody from becoming itself a target of the recipient's 
immune response. Antibodies of this type are especially 
desirable when the subject has an active immune system, or 
when the antibodies are to be administered in repeat doses. 

Accordingly, this invention embodies anti-Glycoprotein
B antibody which is human, or which has been humanized.
Polyclonal human antibody may be purified from the sera of 
human individuals previously infected with the respective 
RFHV/KSHV subfamily herpes virus, or from volunteers 
administered with an active vaccine. Monoclonal human 
antibody may be produced from the lymphocytes of such 
individuals, obtained, for example, from peripheral blood. In 
general, human hybridomas may be generated according to 
the methods outlined earlier. Usually, the production of 
stable human hybridomas will require a combination of 
manipulative techniques, such as both fusion with a human 
myeloma cell line and transformation, for example, with 
EBV. 

In a preferred method, human antibody is produced from 
a chimeric non-primate animal with functional human 
immunoglobulin loci (WO 91/10741: Jakobovits et al.). The 
non-primate animal strain (typically a mouse) is incapable of 
expressing endogenous immunoglobulin heavy chain, and
optimally at least one endogenous immunoglobulin light
chain. The animals are genetically engineered to express
human heavy chain, and optimally also a human light chain.
These animals are immunized with a Glycoprotein B of the
RFHV/KSHV subfamily of herpes viruses. Their sera can
then be used to prepare polyclonal antibody, and their
lymphocytes can be used to prepare hybridomas in the usual
fashion. After appropriate selection and purification, the
resultant antibody is a human antibody with the desired
specificity. 

In another preferred method, a monoclonal antibody with 
the desired specificity for Glycoprotein B is first developed 
in another species, such as a mouse, and then humanized. To 
humanize the antibody, the polynucleotide encoding the
specific antibody is isolated, antigen binding regions are
obtained, and then recombined with polynucleotides encoding
elements of a human immunoglobulin of unrelated
specificity. Alternatively, the nucleotide sequence of the
specific antibody is obtained and used to design a related
sequence with human characteristics, which can be
prepared, for example, by chemical synthesis. The heavy
chain constant region or the light chain constant region of
the specific antibody, preferably both, are substituted with
the constant regions of a human immunoglobulin of the
desired class. Preferably, segments of the variable region of 
both chains outside the complementarity determining 
regions (CDR) are also substituted with their human equiva- 
lents (EP 0329400: Winter). 

Even more preferably, segments of the variable region are
substituted with their human equivalents, providing they are
not involved either in antigen binding or maintaining the
structure of the binding site. Important amino acids may be
identified, for example, as described by Padlan. In one
particular technique (WO 94/11509: Couto et al.), a positional
consensus sequence is developed using sequence and
crystallography data of known immunoglobulins. The amino
acid sequence of the Glycoprotein B specific antibody is
compared with the model sequence, and amino acids
involved in antigen binding, contact with CDRs, or contact
with opposing chains are identified. The other amino acids
are altered, where necessary, to make them conform to a
consensus of human immunoglobulin sequences. A polynucleotide
encoding the humanized sequence is then
prepared, transfected into a host cell, and used to produce
humanized antibody with the same Glycoprotein B specificity
as the originally obtained antibody clone.

Specific antibody obtained using any of these methods is
generally sterilized and mixed with a pharmaceutically compatible
excipient. Stabilizers such as 0.3 molar glycine, and
preservatives such as 1:10,000 thimerosal, may also be
present. The suspension may be buffered to neutral pH
(~7.0), for example, by sodium carbonate. The potency may
optionally be adjusted by the addition of normal human IgG,
obtained from large pools of normal plasma, for example, by
the Cohn cold ethanol fractionation procedure. Other
diluents, such as albumin solution, may be used as an
alternative. The concentration is adjusted so that a single
dose administration constitutes 0.005-0.2 mg/kg, preferably
about 0.06 mg/kg. A single dose preferably results in a
circulating level of anti-Glycoprotein B, as detected by
ELISA or other suitable technique, which is comparable to
that observed in individuals who have received an active
Glycoprotein B vaccine or have recovered from an acute
infection with the corresponding virus, or which is known
from experimental work to be protective against challenges
with a pathologic dose of virus.

Administration should generally be performed by intramuscular
injection, not intravenously, and care should be
taken to assure that the needle is not in a blood vessel. 
Special care should be taken with individuals who have a 
history of systemic allergic reactions following administra- 
tion of human globulin. For prophylactic applications, the 
antibody preparation may optionally be administered in 
combination with an active vaccine for Glycoprotein B, as 
described in the preceding sections. For post-exposure 
applications, the antibody preparation is preferably admin- 
istered within one week of the exposure, more preferably 
within 24 hours, or as soon as possible after the exposure. 
Subsequent doses may optionally be given at approximately
3-month intervals.

As for all therapeutic instruments described herein, the 
amount of composition to be used, and the appropriate route 
and schedule of administration, will depend on the clinical 
status and requirements of the particular individual being 
treated. The choice of a particular regimen is ultimately the 
responsibility of the prescribing physician or veterinarian. 

The foregoing description provides, inter alia, a detailed 
explanation of how Glycoprotein B encoding regions of 
herpes viruses, particularly those of the RFHV/KSHV 
subfamily, can be identified and their sequences obtained. 
Polynucleotide sequences for encoding regions of Glyco- 
protein B of both RFHV and KSHV are provided. 

The polynucleotide sequences listed herein for RFHV and 
KSHV are believed to be an accurate rendition of the 
sequences contained in the polynucleotides from the herpes 
viruses in the tissue samples used for this study. They 
represent a consensus of sequence data obtained from mul- 
tiple clones. However, it is recognized that sequences 
obtained by amplification methods such as PCR may comprise
occasional errors in the sequence as a result of amplification.
The error rate is estimated to be between about
0.44% and 0.75% for single determinations, and about the same
rate divided by √(n-1) for the consensus of n different
determinations. Nevertheless, the error rate may be as high
as 1% or more. Sequences free of amplification errors can be
obtained by creating a library of herpes virus polynucleotide 
sequences, using oligonucleotides such as those provided in 
Table 7 to select relevant clones, and sequencing the DNA 
in the selected clones. The relevant methodology is well 
known to a practitioner of ordinary skill in the art, who may 
also wish to refer to the description given in the Example 
section that follows. 
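
As an illustration of the consensus estimate given above, the following minimal Python sketch applies the stated rule (single-determination rate divided by √(n-1)) to the quoted bounds of 0.44% and 0.75%. The function name and the chosen values of n are assumptions made for this example only; they are not part of the described procedure.

```python
import math


def consensus_error_rate(single_rate: float, n: int) -> float:
    """Estimated error rate for a consensus of n determinations.

    Uses the rule of thumb quoted in the text: the single-determination
    rate divided by sqrt(n - 1). Only meaningful for n >= 2.
    """
    if n < 2:
        raise ValueError("a consensus needs at least two determinations")
    return single_rate / math.sqrt(n - 1)


# Bounds quoted in the text for a single determination, as fractions.
LOW, HIGH = 0.0044, 0.0075

# Hypothetical clone counts chosen for illustration.
for n in (3, 5):
    low = consensus_error_rate(LOW, n)
    high = consensus_error_rate(HIGH, n)
    print(f"n={n}: about {low:.3%} to {high:.3%}")
```

With five determinations the divisor is √4 = 2, so the estimated range is halved relative to a single determination.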

It is recognized that allelic variants and escape mutants of 
herpes viruses occur. Polynucleotides and polypeptides may 
be isolated or derived that incorporate mutations, either 
naturally occurring, or accidentally or deliberately induced, 
without departing from the spirit of this invention. 

The examples presented below are provided as a further 
guide to a practitioner of ordinary skill in the art, and are not 
meant to be limiting in any way. 

EXAMPLES 

Example 1 

Oligonucleotide Primers for Herpes Virus 
Glycoprotein B 

Amino acid sequences of known herpes virus Glycopro- 
tein B molecules were obtained from the PIR protein 
database, or derived from DNA sequences obtained from the 
GenBank database. The sequences were aligned by 
computer-aided alignment programs and by hand. 

Results are shown in FIG. 3. sHV1, bHV4, mHV68, EBV
and hHV6 sequences were used to identify regions that were
relatively well conserved, particularly amongst the gamma
herpes viruses. Nine regions were chosen for design of
amplification primers. The DNA sequences for these regions
were then used to design the oligonucleotide primers. The
primers were designed to have a degenerate segment of 8-14
base pairs at the 3' end, and a consensus segment of 18-30
bases at the 5' end. This provides primers with optimal
sensitivity and specificity.

The degenerate segment extended across highly con- 
served regions of herpes virus Glycoprotein B sequences, 
encompassing the least number of alternative codons. The 
primers could therefore be synthesized with alternative 
nucleotide residues at the degenerate positions and yield a 
minimum number of combinations. There were no more 
than 256 alternative forms for each of the primers derived. 
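
The number of alternative forms of a degenerate primer is the product of the number of bases permitted at each position. The sketch below assumes the sequences are written in standard IUPAC nucleotide codes and counts the forms for two oligonucleotides listed later in Table 9. The helper name count_forms is hypothetical and is used here only for illustration.

```python
# Number of bases represented by each IUPAC nucleotide code.
IUPAC_FORMS = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def count_forms(primer: str) -> int:
    """Return how many distinct sequences a degenerate primer represents."""
    total = 1
    for base in primer.upper():
        total *= IUPAC_FORMS[base]  # KeyError on a non-IUPAC character
    return total


# Sequences as listed in Table 9 of this document.
FENSAC = "GCCTTTGAGAATTCYAARTAYATHAAR"
CVNVB = "TAAAAGTACAGCTCCTGCCCGAANACRTTNACRCA"

for name, seq in (("FENSAC", FENSAC), ("CVNVB", CVNVB)):
    print(name, len(seq), count_forms(seq))
```

Both counts agree with the Length and No. of forms columns of Table 9 and fall under the 256-form ceiling stated above.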

The consensus segment was derived from the correspond- 
ing flanking region of the Glycoprotein B sequences. 
Generally, the consensus segment was derived by choosing 
the most frequently occurring nucleotide at each position of 
all the Glycoprotein B sequences analyzed. However, selec- 
tion was biased in favor of C or G nucleotides, to maximize 
the ability of the primers to form stable duplexes. 
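
The consensus rule just described can be sketched as follows. The text does not state how strongly selection was biased toward C or G, so the gc_bonus value below (enough to break ties only) is an assumption, as are the function names and the toy alignment; none of them are taken from the actual Glycoprotein B sequences.

```python
from collections import Counter


def consensus_base(column: str, gc_bonus: float = 0.5) -> str:
    """Pick the most frequent base in an alignment column, favoring C/G.

    gc_bonus is an assumed weighting: 0.5 lets C or G win ties only.
    Characters other than A, C, G, T (such as gaps) are ignored.
    """
    counts = Counter(b for b in column.upper() if b in "ACGT")
    if not counts:
        raise ValueError("column contains no nucleotides")

    def score(base: str) -> float:
        return counts[base] + (gc_bonus if base in "CG" else 0.0)

    # sorted() makes the choice deterministic when scores are equal.
    return max(sorted(counts), key=score)


def consensus(aligned: list[str]) -> str:
    """Build a consensus from equal-length aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return "".join(consensus_base("".join(col)) for col in zip(*aligned))


# Hypothetical toy alignment, for illustration only.
toy = ["ATGCA", "ACGCT", "ATGGA", "ACGGT"]
print(consensus(toy))
```

In the toy alignment the second column is split evenly between T and C, and the bias resolves it to C. A larger gc_bonus would let C or G override a modest majority of A or T.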

Results are shown in FIGS. 4-12, and are summarized in
Table 4. In a PCR, oligonucleotides listed in Table 4 as 
having a "sense" orientation would act as primers by hybrid- 
izing with the strand antisense to the coding strand, and 
initiating polymerization in the same direction as the Gly- 
coprotein B encoding sequence. Oligonucleotides listed in 
Table 4 as having an "antisense" orientation would hybridize 
with the coding strand and initiate polymerization in the 
direction opposite to that of the Glycoprotein B encoding 
sequence. 
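
The orientation convention can be made concrete with a short sketch. A sense primer has the same sequence as the coding strand, so it is located by a direct search; an antisense primer is located by searching for its reverse complement. The coding strand and primers below are hypothetical toy sequences, and the function names are assumptions for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def binding_site(coding_strand: str, primer: str, orientation: str) -> int:
    """Return the 0-based start, on the coding strand, of the region a
    primer covers, or -1 if the primer does not match exactly."""
    if orientation == "sense":
        # Matches the coding strand; hybridizes to the antisense strand.
        return coding_strand.find(primer)
    if orientation == "antisense":
        # Hybridizes to the coding strand itself.
        return coding_strand.find(reverse_complement(primer))
    raise ValueError("orientation must be 'sense' or 'antisense'")


# Hypothetical toy sequences, for illustration only.
coding = "ATGGCCTTTGAGAATTCGTACCGGATAA"
sense_primer = "GCCTTTGAG"
antisense_primer = "CCGGTAC"

start = binding_site(coding, sense_primer, "sense")
end = binding_site(coding, antisense_primer, "antisense") + len(antisense_primer)
print(start, end, coding[start:end])
```

The slice between the two sites is the segment that a primer pair with opposing orientations would be expected to amplify.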

Synthetic oligonucleotides according to the designed 
sequences were ordered and obtained from Oligos Etc, Inc. 

Example 2 

DNA Extraction 

Biopsy specimens were obtained from Kaposi's sarcoma 
lesions from human subjects diagnosed with AIDS. The 
specimens were fixed in paraformaldehyde, embedded in
paraffin, and processed for normal histological
examination.

Fragments of the paraffin samples were extracted with
500 μL of xylene in a 1.5 mL EPPENDORF™ conical
centrifuge tube. The samples were rocked gently for 5 min
at room temperature, and the tubes were centrifuged in an
EPPENDORF™ bench-top centrifuge at 14,000 rpm for 5
min. After removing the xylene with a Pasteur pipette, 500
μL of 95% ethanol was added, the sample was resuspended,
and then re-centrifuged. The ethanol was removed, and the
wash step was repeated. Samples were then air-dried for
about 1 hour. 500 μL of proteinase-K buffer (0.5%
TWEEN™ 20, a detergent; 50 mM Tris buffer pH 7.5; 50
mM NaCl) and 5 μL of proteinase K (20 mg/mL) were
added, and the sample was incubated for 3 h at 55° C. The
proteinase K was inactivated by incubating at 95° C. for 10
min.

Samples of DNA from KS tissue were pooled to provide 
a consistent source of polynucleotide for the amplification 
reactions. This pool was known to contain DNA from 
KSHV, as detected by amplification of KSHV DNA poly- 
merase sequences, as described in commonly owned U.S. 
patent application Ser. No. 60/001,148. 

Example 3 

Obtaining Amplified Segments of KSHV 
Glycoprotein B 

The oligonucleotides obtained in Example 1 were used to 
amplify segments of the DNA extracted from KSHV tissue 
in Example 2, according to the following protocol. 

A first PCR reaction was conducted using 2 μL of pooled
DNA template, 1 μL of oligonucleotide FRFDA (50 pmol/μL),
1 μL of oligonucleotide TVNCB (50 pmol/μL), 10 μL
of 10x buffer, 1 μL containing 2.5 mM of each of the
deoxyribonucleotide triphosphates (dNTPs), 65 μL distilled
water, and 65 μL mineral oil. The mixture was heated to 80° C.
in a Perkin-Elmer (model 480) PCR machine. 0.5 μL Taq
polymerase (BRL, 5 U/μL) and 19.5 μL water were then
added. 35 cycles of amplification were conducted in the
following sequence: 1 min at 94° C., 1 min at the annealing
temperature, and 1 min at 72° C. The annealing temperature
was 60° C. in the first cycle, and decreased by 2° C. each
cycle until 50° C. was reached. The remaining cycles were
performed using 50° C. as the annealing temperature.
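
The reaction set-up and step-down annealing program described above can be checked with a short sketch. The component volumes are those stated in the text; the dictionary layout and function names are assumptions for illustration, and the mineral oil overlay is left out of the total because it does not mix with the aqueous phase.

```python
# Aqueous components of the first-round reaction, in microliters.
FIRST_ROUND_UL = {
    "pooled DNA template": 2.0,
    "FRFDA primer": 1.0,
    "TVNCB primer": 1.0,
    "10x buffer": 10.0,
    "dNTP mix": 1.0,
    "distilled water": 65.0,
    "Taq polymerase": 0.5,
    "water added with Taq": 19.5,
}


def total_volume(components: dict[str, float]) -> float:
    """Sum the aqueous volume of a reaction, in microliters."""
    return sum(components.values())


def annealing_schedule(cycles: int = 35, start: float = 60.0,
                       step: float = 2.0, floor: float = 50.0) -> list[float]:
    """Annealing temperature for each cycle of the step-down program."""
    temps = []
    temp = start
    for _ in range(cycles):
        temps.append(temp)
        temp = max(floor, temp - step)  # never drop below the floor
    return temps


print(total_volume(FIRST_ROUND_UL))
print(annealing_schedule()[:8])
```

The aqueous volume comes to 100 μL, so the 10 μL of 10x buffer is diluted to 1x. The schedule reaches 50° C. at the sixth cycle and stays there for the remaining cycles.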

A second PCR reaction was conducted as follows: to 1 μL
of the reaction mixture from the previous step was added 1
μL oligonucleotide NIVPA (50 pmol/μL), 1 μL oligonucleotide
TVNCB (50 pmol/μL), 10 μL of 10x buffer, 1 μL
dNTPs, 66 μL water, and 65 μL mineral oil. The mixture was
heated to 80° C., and 0.5 μL Taq polymerase in 19.5 μL
water was added. 35 cycles of amplification were conducted
using the same temperature step-down procedure as before.
The PCR product was analyzed by electrophoresing on a 2%
agarose gel and staining with ethidium bromide.

The two-round amplification procedure was performed
using fourteen test buffers. Five buffers yielded PCR product
of about the size predicted by analogy with other herpes
sequences. These included WB4 buffer (10x WB4 buffer is
0.67 M Tris buffer pH 8.8, 40 mM MgCl2, 0.16 M (NH4)2SO4,
0.1 M β-mercaptoethanol, 1 mg/mL bovine serum
albumin, which is diluted 1 to 10 in the reaction). Also tested
was WB2 buffer (the same as WB4 buffer, except with 20
mM MgCl2 in the 10x concentrate). Also tested were buffers
that contained 10 mM Tris pH 8.3, 3.5 mM MgCl2 and 25
mM KCl; or 10 mM Tris pH 8.3, 3.5 mM MgCl2 and 75 mM
KCl; or 10 mM Tris pH 8.8, 3.5 mM MgCl2 and 75 mM
KCl; when diluted to final reaction volume. The WB4 buffer
showed the strongest band, and some additional fainter
bands. This may have been due to a greater overall amount
of labeled amplified polynucleotide in the WB4 sample.
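
Because WB4 and WB2 are specified as 10x concentrates while the other buffers are given at final concentration, it can help to put them on the same footing. The sketch below divides the stated 10x values by the 1-to-10 dilution; the dictionary keys and the helper name dilute are assumptions for illustration.

```python
# 10x WB4 composition as stated in the text (mM unless noted).
WB4_10X = {
    "Tris pH 8.8 (mM)": 670.0,
    "MgCl2 (mM)": 40.0,
    "(NH4)2SO4 (mM)": 160.0,
    "beta-mercaptoethanol (mM)": 100.0,
    "BSA (mg/mL)": 1.0,
}

# WB2 is described as identical except for 20 mM MgCl2 in the concentrate.
WB2_10X = {**WB4_10X, "MgCl2 (mM)": 20.0}


def dilute(stock: dict[str, float], factor: float = 10.0) -> dict[str, float]:
    """Return final concentrations after diluting a stock by `factor`."""
    return {name: value / factor for name, value in stock.items()}


for label, stock in (("WB4", WB4_10X), ("WB2", WB2_10X)):
    print(label, dilute(stock))
```

On this basis the working MgCl2 concentration is 4 mM in WB4 and 2 mM in WB2, compared with 3.5 mM in the three KCl-containing buffers.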

The product from amplification with WB2 buffer was
selected for further investigation. A third round of amplification
was performed to introduce a radiolabel. The last-used
oligonucleotide (TVNCB) was end-labeled with gamma
32P-ATP, and 1 μL was added to 20 μL of the reaction
mixture from the previous amplification step, along with 1
μL 2.5 mM dNTP. The mixture was heated to 80° C., and 0.5
μL Taq polymerase was added. Amplification was conducted
through five cycles of 94° C., 60° C. and 72° C. The reaction
was stopped using 8.8 μL of loading buffer from a Circumvent
sequencing gel kit.

A ~4 μL aliquot of the radiolabeled reaction product was
electrophoresed on a 6% polyacrylamide sequencing gel for
1.5 h at 51° C. The gel was dried for 1.5 h, and an
autoradiograph was generated by exposure for 12 h. Two
bands were identified. The larger band had the size expected
for the fragment from analogy with other gamma herpes
virus sequences.

The larger band was marked and cut out, and DNA was
eluted by incubation in 40 μL TE buffer (10 mM Tris and 1
mM EDTA, pH 8.0). A further amplification reaction was
performed on the extracted DNA, using 1 μL of the eluate,
10 μL 10x WB2 buffer, 1 μL 2.5 mM dNTP, 1 μL of each of
the second set of oligonucleotide primers (NIVPA and
TVNCB), and 65 μL water. The mixture was heated to 80° C.,
and 0.5 μL Taq polymerase in 19.5 μL water was added.
Amplification was conducted through 35 cycles, using the
temperature step-down procedure described earlier.

Example 4 

Sequence of the 386 Base Fragment of KSHV
Glycoprotein B 

The amplified polynucleotide fragment from the Glyco- 
protein B gene of KSHV was purified and cloned according 
to the following procedure. 

40 μL of amplification product was run on a 2% agarose
gel, and stained using 0.125 μg/mL ethidium bromide. The
single band at about 400 base pairs was cut out, and purified
using a QIAGEN™ II gel extraction kit, according to
manufacturer's instructions.

The purified PCR product was ligated into the pGEM™-T
cloning vector. The vector was used to transform competent
bacteria (E. coli JM-109). Bacterial clones containing the
amplified DNA were picked and cultured. The bacteria were
lysed and the DNA was extracted using phenol-chloroform
followed by precipitation with ethanol. Colonies containing
inserts of the correct size were used to obtain DNA for
sequencing. The clone inserts were sequenced from both
ends using vector-specific oligonucleotides (forward and
reverse primers) with a SEQUENASE™ 7-deaza dGTP kit,
according to manufacturer's directions. A consensus
sequence for the new fragment was obtained by combining
sequence data obtained from 5 clones of one KSHV Glycoprotein
B amplification product.

The length of the fragment in between the primer hybridizing
regions was 319 base pairs. The nucleotide sequence is
listed as SEQ. ID NO:3 and shown in FIG. 1. The encoded
polypeptide sequence is listed as SEQ. ID NO:4.

FIG. 13 compares the sequence of this Glycoprotein B
gene fragment with the corresponding sequence of other
gamma herpes viruses. Single dots (.) indicate residues in
other gamma herpes viruses that are identical to those of the
KSHV sequence. Dashes (-) indicate positions where gaps
have been added to provide optimal alignment of the
encoded protein. The longest stretch of consecutive nucleotides
that is identical between the KSHV sequence and any
of the other listed sequences is 14. Short conserved
sequences are scattered throughout the fragment. Overall,
the polynucleotide fragment is 63% identical between
KSHV and the two closest herpes virus sequences, sHV1
and bHV4.
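
The "longest stretch of consecutive identical nucleotides" statistic can be computed from any pair of aligned sequences. The following sketch assumes both sequences are written out in full (not in the dot notation of FIG. 13) with dashes for gaps; the function name and the toy sequences are hypothetical.

```python
def longest_identical_run(seq_a: str, seq_b: str) -> int:
    """Length of the longest run of consecutive identical positions
    in two aligned sequences. A gap ('-') in either sequence ends a run."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    best = run = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b and a != "-":
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


# Hypothetical toy alignment, for illustration only.
print(longest_identical_run("ACGTACGT", "ACGAACGT"))
```

A short maximum run, such as the 14 nucleotides reported above, indicates that identity is spread across the fragment in small blocks and not concentrated in one long shared segment.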

The sequence data was used to design Type 3 oligonucleotide
primers of 20-40 base pairs in length. The primers were
designed to hybridize preferentially with the KSHV Glycoprotein
B polynucleotide, but not with other sequenced
polynucleotides encoding Glycoprotein B. Example primers
of this type were listed earlier in Table 7.

FIG. 14 compares the predicted amino acid sequence
encoded by the same Glycoprotein B gene fragment. At the
amino acid level, two short segments are shared between
KSHV and a previously known gamma herpes virus, bHV4.
The first (SEQ. ID NO:64) is 13 amino acids in length and
located near the N-terminal end of the fragment. The second
(SEQ. ID NO:65) is 15 amino acids in length and located
near the C-terminal end of the fragment. All other segments
shared between KSHV and other gamma herpes viruses are
9 amino acids or shorter.

Example 5 

Sequence of the 386 Base Fragment of RFHV
Glycoprotein B

Tissue specimens were obtained from the tumor of a
Macaque nemestrina monkey at the University of Washington
Regional Primate Research Center. The specimens were
fixed in paraformaldehyde and embedded in paraffin. DNA 
was extracted from the specimens according to the proce- 
dure of Example 2. 

The presence of RFHV polynucleotide in DNA prepara- 
tions was determined by conducting PCR amplification 
reactions using oligonucleotide primers hybridizing to the 
DNA polymerase gene. Details of this procedure are pro- 
vided in commonly owned U.S. patent application Ser. No. 
60/001,148. DNA extracts containing RFHV polynucleotide 
determined in this fashion were pooled for use in the present 
study. 

DNA preparations containing RFHV polynucleotide 
served as the template in PCR amplification reactions using 
Glycoprotein B consensus-degenerate oligonucleotides 
FRFDA and TVNCB, followed by a second round of ampli- 
fication using oligonucleotides NIVPA and TVNCB. Con- 
ditions were essentially the same as in Example 3, except 
that only WB4 buffer produced bands of substantial 
intensity, with the amount of DNA in the initial source and 
the conditions used. Labeling of the amplified DNA was 
performed with 32P end-labeled NIVPA, as before; the
product was electrophoresed on a 6% polyacrylamide gel, 
and an autoradiogram was obtained. A ladder of bands 
corresponding to about 386 base pairs and about 10 higher 
mol wt concatemers was observed. The 386 base pair band 
(with the same mobility as a simultaneously run KSHV 
fragment) was cut out of the gel and extracted. 

To determine whether the DNA in this extract was
obtained from a specific amplification reaction, PCRs were
set up using NIVPASQ alone, TVNCBSQ alone, or the two
primers together. Buffer conditions were the same as for the 
initial amplification reactions. The mixture was heated to 
80° C, Taq polymerase was added, and the amplification 
was carried through 35 cycles using the temperature step- 
down procedure. Theoretically, specific amplification reac- 
tions accumulate product linearly when one primer is used, 
and exponentially when using two primers with opposite 
orientation. Thus, specificity is indicated by more product in 
the reaction using both primers, whereas equal product in all 
three mixtures suggests non-specific amplification. Ampli- 
fication products from these test reactions were analyzed on 
an agarose gel stained with ethidium bromide. The RF 
extract showed no product for the NIVPASQ reaction, a 
moderate staining band for the TVNCBSQ reaction at the 
appropriate mobility, and an intensely staining band for both 
primers together. For a KSHV fragment assayed in parallel, 
there was a faint band for the NIVPASQ reaction, no band 
for the TVNCBSQ reaction, and an intensely staining band 
for both primers together. We concluded that the 386 base 
pair band in the RF extract represented specific amplification 
product. 
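
The reasoning behind this specificity test can be illustrated with an idealized model. The sketch assumes perfect efficiency in every cycle and that strands made from a single primer cannot themselves serve as templates for that primer; real reactions fall short of both assumptions, and the function name is hypothetical.

```python
def expected_copies(template_copies: int, cycles: int, n_primers: int) -> int:
    """Idealized product count after a number of PCR cycles.

    One primer: only the original templates are copied each cycle,
    so product grows linearly. Two opposing primers: every strand is
    copied each cycle, so product doubles per cycle.
    """
    if n_primers == 1:
        return template_copies * (1 + cycles)
    if n_primers == 2:
        return template_copies * 2 ** cycles
    raise ValueError("n_primers must be 1 or 2")


for n in (1, 2):
    print(n, expected_copies(1, 35, n))
```

Under these assumptions the two-primer reaction exceeds the single-primer reaction by many orders of magnitude after 35 cycles, which is why a much stronger band with both primers indicates specific amplification.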

Accordingly, 40 //L of the RF extract that had been 
amplified with both primers was run preparatively on a 2% 
agarose gel, and the ~386 base pair band was cut out.
Agarose was removed using a QIAGEN™ kit, and the 
product was cloned in E. coli and sequenced as in Example 
4. A consensus sequence was determined for 3 different 
clones obtained from the same amplified RFHV product. 

The polynucleotide sequence of RFHV Glycoprotein B 
fragment (SEQ. ID NO:1) is aligned in FIG. 1 with the
corresponding sequence from KSHV. Also shown is the 
encoded RFHV amino acid sequence (SEQ. ID NO: 2). 
Between the primer hybridization regions (nucleotides
36-354), the polynucleotide sequences are 76% identical;
and the amino acid sequences are 91% identical. The internal
cysteine residue and the potential N-linked glycosylation
site are both conserved between the two viruses.
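
Amino acid identity can exceed nucleotide identity, as it does here, because many nucleotide differences fall at synonymous codon positions. The sketch below shows the calculation on two hypothetical toy sequences (not the RFHV or KSHV sequences) using the standard genetic code; the function names are assumptions for illustration. It also shows that positions 36 to 354 of a 386-base fragment span 319 bases.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions that are identical in two equal-length strings."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def between_primers(fragment: str, first: int = 36, last: int = 354) -> str:
    """Return the 1-based inclusive region first..last of a fragment."""
    return fragment[first - 1:last]


# Hypothetical toy sequences, for illustration only.
toy_a = "GCTAAAGAATGT"
toy_b = "GCCAAGGAGTGC"
print(percent_identity(toy_a, toy_b))
print(percent_identity(translate(toy_a), translate(toy_b)))
print(len(between_primers("N" * 386)))
```

In the toy pair every difference is at a third codon position, so the encoded peptides are identical even though a third of the nucleotides differ.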

The sequence data was used to design Type 3 oligonucleotide
primers of 20-40 base pairs in length. The primers
were designed to hybridize preferentially with the RFHV
Glycoprotein B polynucleotide, but not with other
sequenced polynucleotides encoding Glycoprotein B.
Example primers of this type were listed earlier in Table 7.

FIG. 15 compares the predicted amino acid sequence
encoded by nucleotides 36-354 of the Glycoprotein B gene
fragment. As for the KSHV sequence, two short segments
are shared between RFHV and a previously known gamma
herpes virus, bHV4. All other segments shared between
RFHV and other gamma herpes viruses are shorter than 9
amino acids in length.

FIG. 16 is an alignment of sequence data for the same
Glycoprotein B fragment in the spectrum of herpes viruses
for which data is available. FIG. 17 shows the phylogenetic
relationship between herpes viruses, based on the degree of
identity across the partial Glycoprotein B amino acid
sequences shown in FIG. 16. By amino acid homology,
amongst the viruses shown, RFHV and KSHV are most
closely related to bHV4, eHV2, and sHV1.

Example 6 

Oligonucleotide Primers and Probes for the RFHV/ 
KSHV Subfamily 

Based on the polynucleotide fragment obtained for RFHV
and KSHV, seven Type 2 oligonucleotides were designed
that could be used either as PCR primers or as hybridization
probes with members of the RFHV/KSHV subfamily.

Four consensus-degenerate Type 2 oligonucleotides,
SHMDA, CFSSB, ENTFA, and DNIQB are shown in FIG.
17, alongside the sequences they were derived from. Like
the oligonucleotides of Example 1, they have a consensus
segment towards the 5' end, and a degenerate segment
towards the 3' end. However, these oligonucleotides are
based only on the RFHV and KSHV sequences, and will
therefore preferentially form stable duplexes with Glycoprotein
B of the RFHV/KSHV subfamily. A list of exemplary
Type 2 oligonucleotides was provided earlier in Table
6.

Different Type 2 oligonucleotides have sense or antisense
orientations. Primers with opposing orientations may be
used together in PCR amplifications. Alternatively, any Type
2 oligonucleotide may be used in combination with a Type
1 oligonucleotide with an opposite orientation.

Example 7 

Upstream and Downstream Glycoprotein B 
Sequence 


Further amplification reactions are conducted to obtain
additional sequence data. The source for KSHV DNA is
Kaposi's sarcoma tissue, either frozen tissue blocks or
paraffin-embedded tissue, prepared according to Example 2,
or cell lines developed from a cancer with a KSHV etiology,
such as body cavity lymphoma. Also suitable is KSHV that
is propagated in culture (Weiss et al.).

The general strategy to obtain further sequence data in the
5' direction of the coding strand is to conduct amplification
reactions using the consensus-degenerate (Type 1) oligonucleotide
hybridizing upstream from the fragment as the 5'
primer, in combination with the closest virus-specific (Type
3) oligonucleotides as the 3' primers. Thus, a first series of
amplification cycles is conducted, for example, using
FRFDA and TNKYB as the first set of primers. This may
optionally be followed by a second series of amplification
cycles, conducted, for example, using FRFDA and GLTEB
as a second set of primers.

The conditions used are similar to those described in
Examples 3 and 4. The reaction is performed in WB4 buffer,
using the temperature step-down procedure described in
Example 3. After two rounds of amplification, the product is
labeled using the last-used virus-specific oligonucleotide
(GLTEB, in this case), end-labeled with gamma 32P-ATP.
The labeled product is electrophoresed on 6%
polyacrylamide, and a band corresponding to the appropriate
size as predicted by analogy with other herpes viruses is
excised. After re-amplification, the product is purified,
cloned, and sequenced as before. A consensus sequence for
the new fragment is obtained by combining results of about
three determinations.

In order to obtain further sequence data in the 3' direction 
of the coding strand, amplifications are conducted using 
consensus-degenerate (Type 1) oligonucleotides hybridizing 
downstream from the fragment as the 3' primer, in combi- 
nation with the closest virus-specific (Type 3) oligonucle- 
otides as the 5' primers. In one example, a first series of 
amplification cycles are conducted using NVFDB and 
TVFLA, optionally followed by a second series conducted 
using NVFDB and SQPVA. Amplification and sequencing is 
performed as before. The new sequence is used to design 
further Type 3 oligonucleotides with a sense orientation, 
which are used with other downstream-hybridizing Type 1 
oligonucleotides (such as FREYB and NVFDB) to obtain 
further sequence data. Alternatively, further sequence data in 
the 3' direction is obtained using Type 1 oligonucleotides 
with opposite orientation: for example, two primers are 
selected from the group of FRFDA, NIVPA, TVNCA, 
NIDFB, NVFDB, and FREYB; additional primers may be 
selected for nested amplification. 

To obtain sequence data 3' from the most downstream 
oligonucleotide primer, Type 1 primers such as CYSRA, or 
Type 3 primers such as TVFLA, may be used in combination 
with primers hybridizing towards the 5' end of the DNA 
polymerase gene. Oligonucleotide primers hybridizing to 
the DNA polymerase gene of herpes viruses related to 
RFHV and KSHV are described in commonly owned U.S. 
patent application Ser. No. 60/001,148. The DNA polymerase
encoding region is located 3' to the Glycoprotein B
encoding region. PCRs conducted using this primer combi- 
nation are expected to amplify polynucleotides comprising 
the 3' end of the Glycoprotein B encoding region, any 
intervening sequence if present, and the 5' end of the DNA 
Polymerase encoding region. 

This strategy was implemented as follows: 

DNA containing KSHV encoding sequences for Glyco- 
protein B was prepared from a frozen Kaposi's sarcoma 
sample, designated RiGr, and a cell line derived from a body 
cavity lymphoma, designated BC-1. 

In order to obtain the full 5' sequence, a Type 1 oligo- 
nucleotide probe was designed for the encoding sequence 
suspected of being upstream of Glycoprotein B: namely, the 
capsid maturation gene (CAPMAT). Known sequences of 
CAPMAT from other viruses were used to identify a rela- 
tively conserved region, and design a consensus-degenerate 
primer designated FENSA to hybridize with CAPMAT in 
the sense orientation of Glycoprotein B. A Type 1 oligonucleotide 
probe was designed for the encoding sequence 
suspected of being downstream of Glycoprotein B: namely, 
the DNA polymerase. These oligonucleotides are listed in 
Table 9: 

TABLE 9 

Additional Type 1 Oligonucleotides used for Detecting, Amplifying, or 
Characterizing Herpes Virus Polynucleotides 

Designation  Sequence (5' to 3')                  Length  No. of forms  Orientation  SEQ ID: 

Target: Capsid/Maturation gene from Herpes Viruses, especially from gamma Herpes Viruses 

FENSAC       GCCTTTGAGAATTCYAARTAYATHAAR          27      48            sense        77 
FENSAG       GGGTTTGAGAATTCYAARTAYATHAAR          27      48            sense        78 

Target: DNA polymerase gene from Herpes Viruses, especially from gamma Herpes Viruses 

CVNVB        TAAAAGTACAGCTCCTGCCCGAANACRTTNACRCA  35      64            antisense    79 

Amplification was carried out using pairs of sense and 
antisense primers that covered the entire Glycoprotein B 
encoding region. Fragments obtained include those listed in 
Table 10. 

TABLE 10 

KSHV Glycoprotein B fragments obtained 

Fragment  Primers          Length   Position 

1         NIVPA → TVNCB    0.39 kb  original fragment 
2         FENSA → VNVNB    0.9 kb   5' of fragment 1 across to CAPMAT 
3         TVNCA → FREYB    2.3 kb   3' of fragment 1 
4         FAYDA → FREYB    0.65 kb  3' of fragment 1 
5         SQPVA → HVLQB    2.5 kb   3' of fragment 1 across to DNA polymerase 
6         FREYA → SCGFB    1.1 kb   3' of fragment 2 across to DNA polymerase 

The protocol used for amplifying and sequencing was as 
follows: PCR amplification was carried out using the DNA 
template with the primer pair (e.g., FREYA and SCGFB). 35 
cycles were conducted of 94° C. for 45 sec, 60° C. for 45 sec, 
and 72° C. for 45 sec; and then followed by a final extension 
step at 72° C. for 10 min. PCR products of the predicted 
length were purified on agarose gels using the 
QIAQUICK™ PCR purification kit from Qiagen. Purified 
PCR products were reamplified in a second round of amplification. 
The second round was conducted alternatively in a 
nested or non-nested fashion. In the example given, second-round 
amplification was conducted using FREYA and 
SCGFB, or with FREYA and HVLQB. Amplification for 35 
cycles was conducted at 94° C. for 45 sec, 65° C. for 45 sec, 
and 72° C. for 45 sec; and then followed by a final extension 
step at 72° C. for 60 min. 
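
The "No. of forms" column of Table 9 can be cross-checked by multiplying the number of bases each IUPAC ambiguity code stands for. The following is a minimal sketch in Python, not part of the described method; the function name `degeneracy` is chosen here for illustration only, and the two sequences are the FENSAC and CVNVB entries of Table 9.

```python
from math import prod

# Standard IUPAC nucleotide ambiguity codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())

# Illustrative inputs taken from Table 9.
FENSAC = "GCCTTTGAGAATTCYAARTAYATHAAR"
CVNVB = "TAAAAGTACAGCTCCTGCCCGAANACRTTNACRCA"

if __name__ == "__main__":
    for name, seq in (("FENSAC", FENSAC), ("CVNVB", CVNVB)):
        print(name, len(seq), degeneracy(seq))
```

For these two entries the computed lengths and counts agree with the tabulated values (27 and 48 for FENSAC, 35 and 64 for CVNVB).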

The PCR products were ligated into the Novagen PT7 
BLUE™ vector, and transformed into Novablue competent 
E. coli. Ligations and transformations were performed using 
Novagen protocols. Colonies were screened by PCR using 
M13 forward and reverse oligonucleotides. Using the QIAQUICK 
plasmid isolation kit, plasmids were isolated from 
PCR positive colonies that had been grown up overnight in 
5 mL LB broth at 37° C. Manual sequencing of the plasmids 
using M13 forward and reverse sequencing primers was 
performed following the USB Sequenase Kit protocol 
(USB). Automated sequencing was performed by ABI methods. 

Additional KSHV-specific Type 3 oligonucleotides were 
designed as the KSHV sequence emerged. Type 3 oligonucleotides 
were used in various pair combinations or with 
Type 1 oligonucleotides to PCR amplify, clone, and 
sequence sections of the KSHV DNA. The Type 3 oligonucleotides 
used are listed in Table 11: 



TABLE 11 

Additional Type 3 Oligonucleotides used for Detecting, Amplifying, or 
Characterizing Herpes Virus Polynucleotides encoding Glycoprotein B 

Designation  Sequence (5' to 3')    Length  No. of forms  Orientation  SEQ ID: 

Target: Glycoprotein B from KSHV 

GAYTA        TGTGGAAACGGGAGCGTACAC  21      1             sense        80 
DTYSB        TCAGACAAGAGTACGTGTCGG  21      1             antisense    81 
AIYGB        TACAGGTCGACCGTAGATGGC  21      1             antisense    82 
VTECA        CGCCATTTCCGTGACCGAGTG  21      1             sense        83 
CEHYB        TGATGAAGTAGTGTTCGCAGG  21      1             antisense    84 
DLGGB        GATGCCACCCAGGTCCGCCAC  21      1             antisense    85 
DLGGA        GTGGCGGACCTGGGTGGCATC  21      1             sense        86 
RAPPA        CGTAGATCGCAGGGCACCTCC  21      1             sense        87 

Target: DNA Polymerase from KSHV 

GEVFB        GTCTCTCCCGCGAATACTTCT  21      1             antisense    88 
HVLQB        GAGGGCCTGCTGGAGGACGTG  21      1             antisense    89 
SCGFB        CGGTGGAGAAGCCGCAGGATG  21      1             antisense    90 
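
Sense and antisense primers in Table 11 that share a designation stem (DLGGA and DLGGB) would be expected to hybridize to the same site on opposite strands, which means one should be the reverse complement of the other. The short Python sketch below checks this for the two tabulated sequences; the helper name `reverse_complement` is an illustrative choice, not something taken from the text.

```python
# Translation table mapping each unambiguous base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

# Sequences as listed in Table 11.
DLGGB = "GATGCCACCCAGGTCCGCCAC"  # antisense, SEQ ID 85
DLGGA = "GTGGCGGACCTGGGTGGCATC"  # sense, SEQ ID 86

if __name__ == "__main__":
    print(reverse_complement(DLGGB) == DLGGA)
```

The same check is a quick way to catch transcription errors when a primer table is re-keyed.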
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FIG. 18 is a map showing the location where oligonucle- 
otides hybridize with the KSHV DNA. Abbreviations used 
are as follows: d or h=consensus-degenerate probes that 
hybridize with herpesvirus sequences (Type 1), 
sq=additional sequencing tail available, g=probes that 
hybridize with gamma herpesviruses (Type 1), f=probes that 
hybridize with KSHV/RFHV family of herpesviruses (Type 
2), ks=probes specific for KSHV (Type 3). 

FIG. 19 lists a consensus sequence obtained by compiling 
sequence data from each of the characterized fragments. The 
polynucleotide sequence (SEQ. ID NO:91) is shown. Nucleotides 
1-3056 (SEQ. ID NO:92), incorporating the region 
before the DNA polymerase encoding sequence, constitute an 
embodiment of this invention. This consensus sequence 
represents the consensus of data obtained from both the 
Kaposi's sarcoma sample RiGr, and the lymphoma cell line 
BC-1, with a plurality of clones being sequenced for each 
sample and each gene segment. Between about 3 and 9 determinations 
have been performed at each location. 

Also shown in FIG. 19 is the amino acid translation of the 
three open reading frames (SEQ. ID NOS:93-95). The 
encoded CAPMAT protein fragment (SEQ. ID NO:93) overlaps 
the 5' end of the Glycoprotein B encoding sequence 
(SEQ. ID NO:94) in a different phase. Further upstream, the 
CAPMAT encoding sequence is also suspected of comprising 
control elements for Glycoprotein B transcription, due to 
homology with the binding site for RNA polymerase 2 of 
Epstein Barr Virus. This putative promoter region is underlined 
in the Figure. At the 3' end of the Glycoprotein B 
encoding sequence, there is an untranslated sequence including 
a polyadenylation signal. Further downstream is the 
encoding sequence for a DNA Polymerase fragment (SEQ. 
ID NO:95). 

When the Glycoprotein B encoding sequence was compared 
with other sequences on GenBank, homology was 
found only with Glycoprotein B sequences from other 
herpes viruses. Occasional sequences of 20 nucleotides or 
less are shared with several herpes viruses. The sequence 
ATGTTCAGGGAGTACAACTACTACAC (SEQ. ID 
NO:98) is shared with eHV2. Other than this sequence, 
segments of the KSHV encoding region 21 nucleotides or 
longer are apparently unique, compared with other previously 
known sequences. 

Within the Glycoprotein B encoding sequence, four allelic 
variants were noted at the polynucleotide level between 
sequence data obtained using the Kaposi's sarcoma sample 
and that obtained using the body cavity lymphoma cell line. 
These are indicated in the Figure by arrows. Three of the 
four variants were silent. The fourth variant causes a difference 
of Proline to Leucine in the gene product. 
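
Whether a nucleotide variant is silent or changes the gene product follows directly from the standard genetic code. The sketch below, in Python, classifies a codon substitution; the codons used in the example are hypothetical, since the text does not state which codons carry the four variants, and the function name `classify_variant` is likewise an illustrative choice.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G
# (first position varying slowest). "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_variant(ref_codon: str, alt_codon: str) -> str:
    """Report whether a codon change is silent or alters the amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"
    return f"missense {ref_aa}->{alt_aa}"

if __name__ == "__main__":
    # Hypothetical codons: a third-position change within a proline codon,
    # and a second-position C-to-T change converting proline to leucine.
    print(classify_variant("CCG", "CCA"))
    print(classify_variant("CCG", "CTG"))
```

Any proline codon (CCN) becomes a leucine codon (CTN) through a single C-to-T change at the second position, which is one way a Proline to Leucine difference of the kind described could arise.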

The protein product encoded by the KSHV Glycoprotein 
B gene has the following features: There is a domain at the 
N-terminus that corresponds to the signal-peptide domain 
(the "leader") of Glycoprotein B of other herpes viruses. An 
alignment of the complete KSHV Glycoprotein B amino acid sequence with 
those known for other herpes viruses is provided in FIG. 3, 
and reveals areas of homology. Residues highly conserved 
amongst herpes virus Glycoprotein B sequences are marked 
with an asterisk (*). The cysteine residues conserved 
amongst other herpes virus Glycoprotein B sequences are 
also present in that of KSHV. In addition, there are two 
additional cysteines which could form an additional internal 
disulfide and stabilize the three-dimensional structure 
(marked by "•"). The KSHV Glycoprotein B sequence also 
has a predicted membrane-spanning domain that corresponds 
to that on Glycoprotein B of other herpes viruses. 
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Another feature of the KSHV Glycoprotein B is the 
presence of an RGD triplet near the N-terminal of the mature 
protein. The same triplet is present in proteins such as 
fibrinogen, fibronectin, vitronectin, thrombospondin, 
osteopontin, and laminin, and has been shown to direct 
binding of these proteins to cell surfaces via integrin receptors. 
The RGD domain of laminin has been shown to bind 
to endothelial cells, and binding of laminin mediates differentiation 
and the production of capillary-like structures in 
vitro (Grant et al.). RGD domains are part of the cell 
adhesion sites of fibronectin and vitronectin (Ruoslahti et al., 
Humphries et al.). 

The upper panel of FIG. 24 provides a comparison of the 
RGD domain in the KSHV Glycoprotein B protein sequence 
with other known RGD sequences. The residues flanking the 
RGD triplet show some similarity between the proteins. In 
particular, a number of the sequences have serine (S) and 
threonine (T) residues immediately flanking the RGD triplet, 
other T, S, F, and P residues to the C-terminal side, and a C 
residue in the N-terminal direction. 

The lower panel of FIG. 24 shows an alignment of the 
KSHV Glycoprotein B protein sequence with the Glycopro- 
tein B sequence of other gamma herpes viruses. Potential 
peptidase cleavage sites for the KSHV protein are indicated, 
based on the possession of cleavage sites in the other 
sequences. The RGD triplet is located about 3-9 residues 
from the expected N-terminus of the mature protein. There 
is no RGD sequence present in the Glycoprotein B of 
gamma herpes viruses outside the RFHV/KSHV subfamily. 
If the triplet mediates the infectivity or pathology of the 
KSHV virus, this property may be unique in comparison to 
viruses outside the subfamily. 

The presence of an RGD domain at the N-terminus of the 
KSHV glycoprotein B suggests that the domain mediates 
attachment of KSHV to cells containing an appropriate 
integrin receptor, such as B-lymphocytes and endothelial 
cells, leading to infection of these cells. It is also possible 
that the domain mediates the differentiation of infected 
endothelial cells into capillary-like structures that are characteristic 
of Kaposi's sarcoma lesions. Blocking the attach- 
ment of KSHV to cells through the Glycoprotein B RGD 
domain may inhibit infection, tumor formation, or angio- 
genesis. 

The RGD triplet in Glycoprotein B is potentially important 
in therapeutic approaches to KSHV infection in several 
respects. In one example, it may be of benefit to develop 
vaccines that are based on or enriched for Glycoprotein B 
peptides that incorporate the RGD sequence. Using a KSHV 
peptide of 7-20 amino acids encompassing this region, 
enough immunogenicity may be present to elicit antibodies 
for which the RGD triplet would be part of the epitope. Circulating 
antibodies with this specificity may rapidly sequester the 
RGD site, and decrease any ability of this region to participate 
in viral infectivity or pathology. 

In another example, peptides based on the KSHV Glyco- 
protein B sequence and including the RGD triplet may also 
inhibit viral infectivity or pathology, and could be admin- 
istered immediately to counter an acute exposure. To the 
extent that binding to the RGD receptor also depends on 
residues in the ligand that neighbor the RGD triplet, the 
inhibition may be somewhat selective for KSHV virus in 
comparison with other RGD-bearing substances. 

The full glycoprotein B sequence of RFHV is obtained by 
a similar strategy to that used for obtaining the KSHV 
sequence. The source for RFHV DNA is similarly prepared 
tissue from infected monkeys at the University of Washing- 
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ton Regional Primate Research Center. DNA is extracted as 
described in Example 5. 

In order to obtain further sequence data in the 5' direction 
of the coding strand, amplifications are conducted using the 
consensus-degenerate (Type 1) oligonucleotide hybridizing 
upstream from the fragment as the 5' primer, in combination 
with the closest virus-specific (Type 3) oligonucleotides as 
the 3' primers. Thus, a first series of amplification cycles is 
conducted, for example, using FRFDA and AAITB as the 
first set of primers. This is followed by a second series of 
amplification cycles, conducted using the same primers, or using 
the nested set FRFDA and GMTEB. Amplification conditions 
are similar to those described for KSHV. 

In order to obtain further sequence data in the 3' direction 
of the coding strand, amplifications are conducted using 
consensus-degenerate (Type 1) oligonucleotides hybridizing 
downstream from the fragment as the 3' primer, in combination 
with the closest virus-specific (Type 3) oligonucleotides 
as the 5' primers. Thus, a first series of amplification 
cycles is conducted using NVFDB and VEGLA, followed 
by a second series conducted using NVFDB and PVLYA. 
Amplification and sequencing are performed as before. The 
new sequence is used to design further Type 3 oligonucleotides 
with a sense orientation, which are used with other 
downstream-hybridizing Type 1 oligonucleotides (namely 
FREYB and NVFDB) to obtain further sequence data. 

Polynucleotide and amino acid sequence data are used to 
compare the Glycoprotein B of RFHV and KSHV with each 
other, and with that of other herpes viruses. The RFHV and 
KSHV sequences may be used to design further subfamily-specific 
Type 2 oligonucleotides, as in Example 6. 

Example 8: Glycoprotein B sequences from DNA 
libraries 

Complete Glycoprotein B sequences can be obtained or 
confirmed by generating DNA libraries from affected tissue. 
Sources of DNA for this study are the same as for Example 
7. 

The DNA lysate is digested with proteinase K, and DNA 
is extracted using phenol-chloroform. After extensive 
dialysis, the preparation is partially digested with the Sau3A 
I restriction endonuclease. The digest is centrifuged on a 
sucrose gradient, and fragments of about 10-23 kilobases 
are recovered. The lambda DASH-2™ vector phage 
(Stratagene) is prepared by cutting with BamHI. The size-selected 
fragments are then mixed with the vector and 
ligated using DNA ligase. 

The ligated vector is prepared with the packaging extract 
from Stratagene according to manufacturer's directions. It is 
used to infect XL1-BLUE™ MRA bacteria. About 200,000 
of the phage-infected bacteria are plated onto agar at a 
density of about 20,000 per plate. After culturing, the plates 
are overlaid with nitrocellulose, and the nitrocellulose is cut 
into fragments. Phage are eluted from the fragments and 
their DNA is subjected to an amplification reaction using 
appropriate virus-specific primers. The reaction products are 
run on an agarose gel, and stained with ethidium bromide. 
Phage are recovered from regions of the plate giving amplified 
DNA of the expected size. The recovered phage are used 
to infect new XL1 bacteria and re-plated in fresh cultures. 
The process is repeated until single clones are obtained at 
limiting dilution. 

Each clone selected by this procedure is then mapped 
using restriction nucleases to ascertain the size of the 
fragment incorporated. Inserts sufficiently large to incorpo- 
rate the entire Glycoprotein B sequence are sequenced at 
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both ends using vector-specific primers. Sequences are com- 
pared with the known polynucleotide sequence of the entire 
EBV genome to determine whether the fragment spans the 
intact Glycoprotein B sequence. DNA is obtained from 
suitable clones, sheared, and sequenced by shot-gun cloning 
according to standard techniques. 

Example 9: Antigenic regions of Glycoprotein B 

The polynucleotide fragments between the hybridization 
sites for NIVPA and TVNCB in the Glycoprotein B gene 
have the predicted amino acid sequences shown in FIG. 14. 
Based on these sequences, peptides that are unique for 
RFHV or KSHV, or that are shared between species can be 
identified. 

FIG. 14 shows example peptides of 6 or 7 amino acids in 
length. Some of the peptides comprise one or more residues 
that are distinct either for RFHV or KSHV (Class III), or for 
the RFHV/KSHV subfamily (Class II) compared with the 
corresponding gamma herpes virus peptides. 

To confirm that regions contained within this 106-amino 
acid region of Glycoprotein B may be recognized by 
antibody, computer analysis was performed to generate 
Hopp and Woods antigenicity plots. The Hopp and Woods 
determination is based in part on the relative hydrophilicity 
and hydrophobicity of consecutive amino acid residues 
(Hopp et al.). 

Results are shown in FIGS. 20, 21 and 22. Key: 
~= antigenic; "=hydrophobic; #=potential N-linked glycosy- 
lation site. FIG. 20 shows the analysis of the 106 amino acid 
Glycoprotein B fragment from RFHV; FIG. 21 shows the 
analysis of the KSHV fragment, and FIG. 22 shows the 
analysis of the full KSHV sequence. 

Both RFHV and KSHV contain several regions predicted 
to be likely antibody target sites. In particular, the KSHV 
sequence shows an antigenic region near the N-terminal end 
of this fragment, and near the potential N-linked glycosylation 
site. The full-length KSHV sequence shows hydrophobic 
minima corresponding both to the signal peptide 
(residue ~25) and the transmembrane domain (residue 
~750). A number of putative antigenic regions with scores 
>1.0 or >1.5 are observed. Particularly notable is a region 
scoring up to ~2.5 that appears at about residues 440-460. 
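
A Hopp and Woods style hydrophilicity profile of the kind referred to above can be sketched in a few lines of Python. The sketch assumes the published Hopp and Woods per-residue values and a sliding window of six residues; the window length, the threshold, and the function names are assumptions made for illustration, and the software actually used to produce the figures is not identified in the text. The example input is the first 32 residues of SEQ ID NO:2 in one-letter code.

```python
# Hopp and Woods hydrophilicity values, keyed by one-letter amino acid code.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}

def hopp_woods_profile(protein: str, window: int = 6) -> list[float]:
    """Mean hydrophilicity of each consecutive window along the sequence."""
    scores = [HOPP_WOODS[aa] for aa in protein.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]

def antigenic_windows(protein: str, threshold: float = 1.0, window: int = 6):
    """(0-based start, score) for windows whose mean exceeds the threshold."""
    profile = hopp_woods_profile(protein, window)
    return [(i, s) for i, s in enumerate(profile) if s > threshold]

if __name__ == "__main__":
    fragment = "VYKKNIVPYIFKVRRYIKIATSVTVYRGMTEA"  # start of SEQ ID NO:2
    for start, score in antigenic_windows(fragment, threshold=0.5):
        print(start + 1, fragment[start:start + 6], round(score, 2))
```

Windows with strongly negative means correspond to hydrophobic stretches such as a signal peptide or transmembrane domain, while windows above the chosen threshold are candidate antibody target sites.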

Example 10: Virus specific Glycoprotein B 
amplification assays 

Type 3 oligonucleotides are used in nested virus-specific 
amplification reactions to detect the presence of RFHV or 
KSHV in a panel of tissue samples from potentially affected 
subjects. 

For KSHV, DNA is extracted from tissue suspected of 
harboring the virus; particularly biopsy samples from human 
subjects with Kaposi's Sarcoma lesions and body cavity 
B-cell lymphoma. A number of different tissue samples are 
used, including some from KS lesions, some from appar- 
ently unaffected tissue in the same individuals, some from 
HIV positive individuals with no apparent KS lesions, and 
some from HIV negative individuals. Five samples are 
obtained in each category. DNA is prepared as described in 
Example 2. 

The oligonucleotide primers GLTEA, YELPA, VNVNB, 
and ENTFB are ordered from Oligos Etc., Inc. The DNA is 
amplified in two stages, using primers GLTEA and ENTFB 
in the first stage, and YELPA and VNVNB in the second 
stage. The conditions of the amplification are similar to 
those of Example 3. The reaction product is electrophoresed 
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on a 2% agarose gel, stained with ethidium bromide, and 
examined under UV light. A positive result is indicated by 
the presence of abundant polynucleotide in the reaction 
product, as detected by ethidium bromide staining. This 
reflects the presence of KSHV-derived DNA in the sample; 
specifically, the Glycoprotein B encoding fragment from 
YELPA to VNVNB. Results are matched with patient his- 
tory and sample histopathology to determine whether posi- 
tive assay results correlate with susceptibility to KS. 

For RFHV, DNA is extracted from frozen tissue samples 
taken from Macaca nemestrina and Macaca fascicularis 
monkeys living in the primate colony at the Washington 
Regional Primate Research Center. Ten samples each are taken 
from tissue sites showing overt symptoms of 
fibromatosis, apparently unaffected sites in the same 
monkeys, and corresponding sites in monkeys showing no 
symptomatology. Nested PCR amplification is conducted 
first using GMTEA and VEGLB, then using KYEIA and 
TDRDB. Amplification product is electrophoresed and 
stained as before, to determine whether RFHV polynucleotide 
is present in the samples. 

Example 11: Immunogenic regions of Glycoprotein 
B 

To identify what antibodies may be generated during the 
natural course of infection with KSHV, serum samples are 
obtained from 10-20 AIDS subjects with Kaposi's Sarcoma 
lesions, from 10-20 HIV-positive symptom-negative 
subjects, and 10-20 HIV-negative controls. In initial studies, 
sera in each population are pooled for antibody analysis. 

Peptides 12 residues long are synthesized according to the 
entire predicted extracellular domain of the mature KSHV 
Glycoprotein B molecule. Sequential peptides are prepared 
covering the entire sequence, and overlapping by 8 residues. 
The peptides are prepared on a nylon membrane support by 
standard Fmoc chemistry, using a SPOTS™ kit from 
Genosys according to manufacturer's directions. Prepared 
membranes are overlaid with the serum, washed, and overlaid 
with beta-galactosidase conjugated anti-human IgG. The 
test is developed by adding the substrate X-gal. Positive 
staining indicates IgG antibody reactivity in the serum 
against the corresponding peptide. 
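
The set of sequential 12-residue peptides overlapping by 8 residues amounts to a sliding window advanced 4 residues at a time. A minimal Python sketch follows; the function name and the rule of adding one final peptide flush with the C-terminal end (so that no trailing residues are left uncovered) are assumptions made here, not details given in the text.

```python
def overlapping_peptides(sequence: str, length: int = 12, overlap: int = 8) -> list[str]:
    """Split a protein sequence into fixed-length peptides sharing `overlap` residues."""
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than peptide length")
    last_start = max(len(sequence) - length, 0)
    starts = list(range(0, last_start + 1, step))
    # Assumption: add a final peptide aligned to the C-terminus if residues remain.
    if starts[-1] + length < len(sequence):
        starts.append(last_start)
    return [sequence[s:s + length] for s in starts]

if __name__ == "__main__":
    # Placeholder 20-residue input; a real run would use the predicted
    # extracellular domain of the mature Glycoprotein B sequence.
    for peptide in overlapping_peptides("ABCDEFGHIJKLMNOPQRST"):
        print(peptide)
```

With these settings a domain of N residues yields roughly N/4 peptides, which gives a quick estimate of how many membrane spots are needed.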

Similarly, to identify antibodies formed in the natural 
course of RFHV infection, blood samples are collected from 
10 Macaca nemestrina and 10 Macaca fascicularis monkeys, 
a proportion of which display overt symptoms of fibromatosis. 
The presence or absence of an ongoing RFHV infection 
is confirmed by conducting PCR amplification assays 
using RFHV-specific oligonucleotides as in Example 10. 
Plasma and blood cells are separated by centrifugation. 
These sera are used to test for antibodies in a method similar 
to that for KSHV, except that 12-mers are synthesized based 
on the RFHV Glycoprotein B sequence. 

Select RFHV and KSHV peptides are also tested in 
animal models to determine immunogenicity when administered 
in combination with desirable adjuvants such as alum 
and DETOX™. Suitable peptides include those identified in 
the aforementioned experiment as eliciting antibody during 
the natural course of viral infection. Other candidates 
include those believed to participate in a biological function 
of Glycoprotein B, and those corresponding to peptides of 
other herpes viruses known to elicit viral neutralizing antibodies. 
The peptides are coupled onto keyhole limpet 
hemocyanin (KLH) as a carrier, combined with adjuvant 
according to standard protocols, and 100 µg peptide equivalent 
in 1-2 mL inoculum is injected intramuscularly into 
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rabbits. The animals are boosted with a second dose 4 weeks 
later, and test-bled after a further 2 weeks. 

Microtiter plate wells are prepared for ELISA by coating 
with the immunogen or unrelated peptide-KLH control. The 
wells are overlaid with serial dilutions of the plasma from 
the test bleeds, washed, and developed using beta-galactosidase 
anti-human IgG and X-gal. Positive staining in the test wells 
but not the control wells indicates that the peptide is immunogenic 
under the conditions used. 

Example 12: Identification and characterization of 
Glycoprotein B from other members of the RFHV/ 
KSHV subfamily 

Tissue samples suspected of containing a previously 
undescribed gamma herpes virus, particularly fibroprolif- 
erative conditions, lymphocyte malignancies, and conditions 
associated with immunodeficiency and immunosuppression, 
such as acute respiratory disease syndrome (ARDS), are 
preserved by freezing, and the DNA is extracted as in 
Example 2. Two rounds of PCR amplification are conducted 
using Type 1 oligonucleotides FRFDA and TVNCB in the 
first round, then using nested Type 1 or Type 2 oligonucle- 
otides in the second round. 

Optionally, the presence of an RFHV/KSHV family Gly- 
coprotein B polynucleotide is confirmed by probing the 
amplification product with a suitable probe. The amplified 
polynucleotide is electrophoresed in agarose and blotted 
onto a nylon membrane. The blot is hybridized with a probe 
comprising the polynucleotide fragment obtained from the 
KSHV polynucleotide encoding Glycoprotein B (residues 
36-354 of SEQ. ID NO:3), labeled with 32P. The hybridization 
reaction is done under conditions that will permit a 
stable complex forming between the probe and Glycoprotein 
B from a herpes virus, but not between the probe and 
Glycoprotein B encoding polynucleotides from sources out- 
side the RFHV/KSHV subfamily. Hybridization conditions 
will require approximately 70% identity between hybridiz- 
ing segments of the probe and the target for a stable complex 
to form. These conditions are calculated using the formula 
given earlier, depending on the length and sequence of the 
probe and the corresponding sequence of the target. The 
conditions are estimated to be: a) allowing the probe to 
hybridize with the target in 6×SSC (1×SSC being 0.15 M NaCl, 15 mM 
sodium citrate buffer) at room temperature in the absence of 
formamide; and b) washing newly formed duplexes for a 
brief period (5-10 min) in 2×SSC at room temperature. 
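
The salt concentrations implied by the hybridization and wash buffers can be worked out by scaling the 1×SSC composition. The Python sketch below assumes the standard definition of 1×SSC as 0.15 M NaCl and 15 mM sodium citrate, which is how the parenthetical in the text is read here; the names `SSC_1X` and `ssc` are illustrative.

```python
# Composition of 1x SSC in mol/L (standard definition, assumed here).
SSC_1X = {"NaCl_M": 0.15, "sodium_citrate_M": 0.015}

def ssc(fold: float) -> dict[str, float]:
    """Molar concentrations of NaCl and sodium citrate in an n-fold SSC buffer."""
    return {name: round(conc * fold, 6) for name, conc in SSC_1X.items()}

if __name__ == "__main__":
    print("hybridization, 6x SSC:", ssc(6))
    print("wash, 2x SSC:", ssc(2))
```

The wash in 2×SSC therefore has one third of the salt of the hybridization buffer, making it the more stringent of the two steps at the same temperature.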

Amplified polynucleotides that hybridize to the labeled 
probe under these conditions are selected for further char- 
acterization. Alternatively, PCR amplification products hav- 
ing about the same size as that predicted from the KSHV are 
suspected of having a related sequence. Samples may also be 
suspected of having a related sequence if they have been 
used to obtain polynucleotides encompassing other regions 
of a herpes virus genome, such as DNA polymerase. 
Samples containing fragments potentially different from 
RFHV or KSHV, either due to a size difference or different 
origin, are sequenced across the fragment as in Example 4. 
Those with novel sequences are used to determine the entire 
Glycoprotein B gene sequence by a method similar to that in 
Example 7 or 8. 

A Glycoprotein B encoding sequence from a third mem- 
ber of the RFHV/KSHV herpes virus subfamily was 
obtained as follows. 

DNA was extracted from two frozen tissue samples from 
a Macaca mulatta monkey with retroperitoneal fibromatosis. 
Extraction was conducted according to Example 1. The 
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extracted DNA was precipitated with ethanol in the presence 
of 40 µg glycogen as carrier, washed in 70% ethanol, and 
resuspended in 10 mM Tris buffer, pH 8.0. The extracted 
DNA was used to obtain a 151 base pair fragment of a herpes 
virus DNA polymerase gene, which was non-identical to 
that of KSHV, RFHV, or any other previously characterized 
DNA polymerase. This led to the suspicion that the sample 
contained genomic DNA from a different herpes virus, that 
could be used to identify and characterize a new Glycopro- 
tein B gene. 

A 386 base pair fragment of a Glycoprotein B encoding 
sequence was amplified from the sample using a hemi- 
nested PCR. The procedure was similar to that used in 
Examples 4 and 5, with a first round of amplification using 
FRFDA and TVNCB, followed by a second round of ampli- 
fication using NIVPA and TVNCB. The final PCR product 
was sequenced as before. 

FIG. 23 lists the polynucleotide sequence (SEQ. ID 
NO: 96) along with the corresponding amino acid translation 
(SEQ. ID NO: 97). Underlined is the 319 base pair sequence 
in between the two primer hybridization sites. The 
sequences are different from those of KSHV and RFHV. The 
Glycoprotein B is from a new member of the RFHV/KSHV 
subfamily of herpes viruses, designated RFHV2. 
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SEQUENCES 

SEQ. ID   Designation  Description                              Type            Source 

1         RFHV         Glycoprotein B PCR segment               dsDNA           FIG. 1 
2         RFHV         Glycoprotein B PCR segment               Protein         FIG. 1 
3         KSHV         Glycoprotein B PCR segment               dsDNA           FIG. 1 
4         KSHV         Glycoprotein B PCR segment               Protein         FIG. 1 
5         sHV1         Glycoprotein B sequence                  dsDNA           GenBank HSVSPOLGBP 
6         bHV4         Glycoprotein B sequence                  dsDNA           GenBank BHT4GLYB 
7         eHV2         Glycoprotein B sequence                  dsDNA           GenBank EHVU20824 
8         mHV68        Glycoprotein B sequence                  dsDNA           GenBank MVU08990 
9         hEBV         Glycoprotein B sequence                  dsDNA           GenBank EBV 
10        hCMV         Glycoprotein B sequence                  dsDNA           GenBank HEHCMVGB 
11        hHV6         Glycoprotein B sequence                  dsDNA           GenBank HH6GBXA 
12        hVZV         Glycoprotein B sequence                  dsDNA           GenBank HEVZVXX 
13        HSV1         Glycoprotein B sequence                  dsDNA           GenBank HS1GLYB 
14        sHV1         Glycoprotein B sequence                  Protein         Translation 
15        bHV4         Glycoprotein B sequence                  Protein         Translation 
16        eHV2         Glycoprotein B sequence                  Protein         Translation 
17        mHV68        Glycoprotein B sequence                  Protein         Translation 
18        hEBV         Glycoprotein B sequence                  Protein         Translation 
19        hCMV         Glycoprotein B sequence                  Protein         Translation 
20        hHV6         Glycoprotein B sequence                  Protein         Translation 
21        hVZV         Glycoprotein B sequence                  Protein         Translation 
22        HSV1         Glycoprotein B sequence                  Protein         Translation 
23        sHVSA8       Glycoprotein B sequence                  Protein         Translation 
24-40                  TYPE 1 oligonucleotides                  ssDNA (IUPAC)   Table 4 
                       (Gamma herpes Glycoprotein B) 
41-47                  TYPE 2 oligonucleotide                   ssDNA (IUPAC)   Table 6 
                       (RFHV/KSHV subfamily Glycoprotein B) 
48-55                  TYPE 3 oligonucleotides -                ssDNA           Table 7 
                       RFHV specific Glycoprotein B 
56-63                  TYPE 3 oligonucleotides -                ssDNA           Table 7 
                       KSHV specific Glycoprotein B 
64-66                  CLASS I antigen peptides                 Protein         Table 8 
                       (Gamma herpes Glycoprotein B) 
67-72                  CLASS II antigen peptides                Protein         Table 8 
                       (RFHV/KSHV subfamily Glycoprotein B) 
73-74                  CLASS III antigen peptides -             Protein         Table 8 
                       RFHV specific Glycoprotein B 
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SEQUENCES 



SEQ. ID 


Designation 


Description 


Type 


Source 


75-76 




CLASS III antigen peptide s- 
KSHV specific Glycoprotein B 


Protein 


Table 8 


77-78 




TYPE 1 oligonucleotide 

(Gamma herpes Caps id maturation) 


ssDNA 
(IUPAC) 


Table 9 


79 




TYPE 1 oligonucleotide 

(Gamma herpes DNA polymerase) 


ssDNA 
(IUPAC) 


Table 9 


80-87 




TYPE 3 oligonucleotides - 
KSHV specific Glycoprotein B 




Table 11 


88-90 




TYPE 3 oligonucleotides - 
KSHV specific DNA Polymerase 




Table 11 


91 


KSHV 


DNA sequence comprising encoding regions 
for Capsid Maturation fragment, Glycoprotein 
B, and DNA polymerase fragment 


dsDNA 


FIG. 18 


92 


KSHV 


DNA sequence comprising encoding regions 
for Capsid Maturation fragment and 
Glycoprotein B (residues 1—3056) 


dsDNA 


Example 7 


93 


KSHV 


Capsid Maturation sequence 


Protein 


FIG. 18 


94 


KSHV 


Glycoprotein B sequence 


Protein 


FIG. 18 


95 


KSHV 


DNA polymerase sequence 


Protein 


FIG. 18 


96 


RFHV2 


Glycoprotein B PCR segment 


dsDNA 


FIG. 22 


97 


RFHV2 


Glycoprotein B PCR segment 


Protein 


FIG. 22 


98 




Shared sequence 


dsDNA 


Example 7 


99-100 




CLASS I antigen peptides of Glycoprotein B 


Protein 


Table 8 


101-105 




Signal peptidase cleavage regions 


Protein 


FIG. 24 


106-113 




Peptide sequences comprising RGD domains 


Protein 


FIG. 24 
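
Several oligonucleotide entries in the table are typed as ssDNA (IUPAC), which indicates that some positions carry degenerate base codes. The following is a minimal sketch of what that notation implies, assuming the standard IUPAC nucleotide ambiguity codes. The function names `degeneracy` and `expand_degenerate` and the 6-mer used as input are hypothetical and are not taken from the listing.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for code in oligo.upper():
        count *= len(IUPAC[code])
    return count


def expand_degenerate(oligo):
    """List every concrete sequence represented by a degenerate oligo."""
    choices = [IUPAC[code] for code in oligo.upper()]
    return ["".join(bases) for bases in product(*choices)]


# Hypothetical 6-mer, used only to illustrate the notation.
example = "GTNTAY"
print(degeneracy(example))         # N contributes 4 choices, Y contributes 2
print(expand_degenerate(example))  # the individual sequences in the pool
```

A degenerate oligonucleotide is therefore a pool of sequences, and the pool size grows multiplicatively with each ambiguous position.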



SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(iii) NUMBER OF SEQUENCES: 113 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 386 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GTGTACAAGA AGAACATCGT GCCGTACATT TTCAAGGTAC GCAGGTACAT AAAAATAGCA 60

ACATCTGTCA CGGTCTACCG CGGTATGACA GAAGCAGCAA TCACAAACAA ATATGAGATC 120

CCCAGGCCCG TGCCTCTCTA CGAGATCAGT CACATGGACA GCACCTACCA GTGCTTTAGT 180

TCCATGAAAA TTGTAGTGAA CGGAGTCGAA AATACGTTCA CCGATCGGGA TGACGTAAAC 240

AAAACCGTAT TTCTCCAGCC CGTCGAAGGT CTAACTGACA ACATACAAAG ATACTTTAGC 300

CAACCAGTAC TGTACTCTGA ACCCGGATGG TTCCCAGGTA TCTACAGGGT TGGGACAACA 360

GTAAACTGTG AGATTGTAGA CATGTT 386
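
Each nucleotide line in this listing ends with a running base count, which allows a transcribed copy to be checked for dropped or duplicated bases. The following is a minimal sketch of such a check, assuming each line holds only base letters followed by the count. The helper names `parse_listing_line` and `join_listing` are hypothetical. Whitespace is ignored, so blocks or counts that were split during text extraction are tolerated.

```python
import re


def parse_listing_line(line):
    """Split one listing line into (bases, running_count).

    All whitespace is removed first, so a split block ("AAAAAT AGC A")
    or a split count ("6 0") is read the same as an intact one.
    """
    compact = re.sub(r"\s+", "", line)
    match = re.fullmatch(r"([ACGTacgt]+)(\d+)", compact)
    if match is None:
        raise ValueError(f"not a sequence line: {line!r}")
    return match.group(1).upper(), int(match.group(2))


def join_listing(lines):
    """Concatenate listing lines, checking each running count."""
    sequence = ""
    for line in lines:
        bases, count = parse_listing_line(line)
        sequence += bases
        if len(sequence) != count:
            raise ValueError(
                f"running count {count} does not match {len(sequence)} bases"
            )
    return sequence


# The last line of a 386-base entry holds 26 bases after six full lines.
bases, count = parse_listing_line("GTAAACTGTG AGATTGTAGA CATGTT 386")
print(len(bases), count)  # 26 386
```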



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 128 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Val Tyr Lys Lys Asn Ile Val Pro Tyr Ile Phe Lys Val Arg Arg Tyr
1 5 10 15

Ile Lys Ile Ala Thr Ser Val Thr Val Tyr Arg Gly Met Thr Glu Ala
20 25 30

Ala Ile Thr Asn Lys Tyr Glu Ile Pro Arg Pro Val Pro Leu Tyr Glu
35 40 45

Ile Ser His Met Asp Ser Thr Tyr Gln Cys Phe Ser Ser Met Lys Ile
50 55 60

Val Val Asn Gly Val Glu Asn Thr Phe Thr Asp Arg Asp Asp Val Asn
65 70 75 80

Lys Thr Val Phe Leu Gln Pro Val Glu Gly Leu Thr Asp Asn Ile Gln
85 90 95

Arg Tyr Phe Ser Gln Pro Val Leu Tyr Ser Glu Pro Gly Trp Phe Pro
100 105 110

Gly Ile Tyr Arg Val Gly Thr Thr Val Asn Cys Glu Ile Val Asp Met
115 120 125
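
SEQ ID NO: 2 is described in the sequence table as a translation of the 386 base pair fragment of SEQ ID NO: 1. A 386-base sequence read in its first frame gives 128 complete codons with two bases left over, which matches the stated length of 128 amino acids. The following is a minimal sketch of that translation, assuming the standard genetic code and a reading frame that starts at the first base. The function name `translate` is hypothetical, and the output uses one-letter amino acid codes rather than the three-letter codes of the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate complete codons from the first base; ignore a partial tail."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 30 bases of SEQ ID NO: 1.
print(translate("GTGTACAAGAAGAACATCGTGCCGTACATT"))  # VYKKNIVPYI

# Codon arithmetic for a 386-base fragment.
print(386 // 3, 386 % 3)  # 128 2
```

The printed peptide corresponds to the first ten residues of SEQ ID NO: 2 (Val Tyr Lys Lys Asn Ile Val Pro Tyr Ile).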



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 386 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GTGTACAAGA AGAACATCGT GCCGTATATT TTTAAGGTGC GGCGCTATAG GAAAATTGCC 60

ACCTCTGTCA CGGTCTACAG GGGCTTGACA GAGTCCGCCA TCACCAACAA GTATGAACTC 120

CCGAGACCCG TGCCACTCTA TGAGATAAGC CACATGGACA GCACCTATCA GTGCTTTAGT 180

TCCATGAAGG TAAATGTCAA CGGGGTAGAA AACACATTTA CTGACAGAGA CGATGTTAAC 240

ACCACAGTAT TCCTCCAACC AGTAGAGGGG CTTACGGATA ACATTCAAAG GTACTTTAGC 300

CAGCCGGTCA TCTACGCGGA ACCCGGCTGG TTTCCCGGCA TATACAGAGT TAGGACAACA 360

GTCAACTGTG AGATTGTAGA CATGTT 386



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 128 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Val Tyr Lys Lys Asn Ile Val Pro Tyr Ile Phe Lys Val Arg Arg Tyr
1 5 10 15

Arg Lys Ile Ala Thr Ser Val Thr Val Tyr Arg Gly Leu Thr Glu Ser
20 25 30

Ala Ile Thr Asn Lys Tyr Glu Leu Pro Arg Pro Val Pro Leu Tyr Glu
35 40 45

Ile Ser His Met Asp Ser Thr Tyr Gln Cys Phe Ser Ser Met Lys Val
50 55 60

Asn Val Asn Gly Val Glu Asn Thr Phe Thr Asp Arg Asp Asp Val Asn
65 70 75 80

Thr Thr Val Phe Leu Gln Pro Val Glu Gly Leu Thr Asp Asn Ile Gln
85 90 95

Arg Tyr Phe Ser Gln Pro Val Ile Tyr Ala Glu Pro Gly Trp Phe Pro
100 105 110

Gly Ile Tyr Arg Val Arg Thr Thr Val Asn Cys Glu Ile Val Asp Met
115 120 125
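
SEQ ID NOS: 2 and 4 are both 128 residues long, so they can be compared position by position without introducing gaps. The following is a minimal sketch of a percent identity calculation for two equal-length sequences. The function name `percent_identity` is hypothetical, and the method used to obtain the identity figures reported elsewhere in this document is not stated here, so this is an illustration rather than a reproduction of that calculation. The example compares residues 17 to 32 of the two sequences in one-letter code.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions holding the same residue in both sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length (no gaps handled)")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Residues 17-32 of SEQ ID NO: 2 and SEQ ID NO: 4, one-letter code.
window_seq2 = "IKIATSVTVYRGMTEA"
window_seq4 = "RKIATSVTVYRGLTES"
print(percent_identity(window_seq2, window_seq4))  # 81.25 (13 of 16 match)
```

Applying the same function to the full 128-residue sequences gives a single overall figure for the fragment.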

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2425 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATGGTACCTA ATAAACACTT ACTGCTTATA ATTTTGTCGT TTTCTACTGC ATGTGGACAA 60

ACGACACCTA CTACAGCTGT TGAAAAAAAT AAAACTCAAG CTATATACCA AGAGTATTTC 120

AAATATCGTG TATGTAGTGC ATCAACTACT GGAGAATTGT TTAGATTTGA TTTAGACAGA 180

ACTTGTCCAA GTACTGAAGA CAAAGTTCAT AAGGAAGGCA TTCTTTTAGT GTACAAAAAA 240

AATATAGTTC CATATATCTT TAAAGTCAGA AGATACAAAA AAATCACAAC ATCAGTCCGT 300

ATTTTTAATG GCTGGACTAG AGAAGGTGTT GCTATTACAA ACAAATGGGA ACTTTCTAGA 360

GCTGTTCCAA AATATGAGAT AGATATTATG GATAAGACTT ACCAATGTCA TAATTGCATG 420

CAGATAGAAG TAAACGGAAT GTTAAATTCT TACTATGACA GAGATGGAAA TAACAAAACT 480

GTAGACTTAA AGCCTGTAGA TGGTCTAACG GGTGCAATTA CAAGATACAT TAGCCAACCT 540

AAAGTTTTTG CTGATCCTGG CTGGCTATGG GGAACTTACA GGACTCGAAC TACCGTTAAC 600

TGTGAAATTG TAGACATGTT TGCTAGGTCT GCTGACCCTT ACACATACTT TGTGACTGCG 660

CTTGGCGACA CAGTAGAAGT GTCTCCTTTC TGTGATGTAG ATAATTCATG CCCAAATGCA 720

ACTGACGTGT TGTCAGTACA AATAGACTTA AATCACACTG TTGTTGACTA TGGAAATAGA 780

GCTACATCAC AGCAGCATAA AAAAAGAATA TTTGCTCATA CTTTAGATTA TTCTGTTTCT 840

TGGGAAGCTG TAAACAAATC CGCGTCAGTA TGCTCAATGG TTTTTTGGAA GAGTTTTCAA 900

CGAGCTATCC AAACTGAACA TGACTTAACT TATCATTTCA TTGCTAATGA AATAACAGCA 960

GGATTCTCTA CAGTGAAAGA ACCCTTAGCA AATTTTACAA GTGATTACAA TTGTCTTATG 1020

ACTCATATCA ACACTACTTT AGAGGATAAG ATAGCAAGAG TCAACAATAC TCACACTCCA 1080

AATGGTACAG CAGAATATTA TCAAACAGAA GGTGGAATGA TTTTAGTGTG GCAGCCATTA 1140

ATAGCAATAG AATTAGAAGA AGCAATGTTG GAAGCAACTA CATCTCCAGT AACTCCTAGT 1200

GCACCAACTA GCTCATCTAG AAGTAAGCGA GCAATAAGAA GCATAAGAGA TGTGAGTGCA 1260

GGTTCAGAAA ATAATGTGTT TCTATCACAA ATACAATATG CATATGATAA GCTACGTCAA 1320

AGTATCAACA ACGTGCTAGA AGAGTTAGCT ATAACATGGT GTAGAGAACA AGTGAGACAA 1380

ACAATGGTGT GGTATGAGAT AGCAAAAATT AATCCAACAA GTGTTATGAC AGCAATATAT 1440

GGAAAACCTG TCTCTCGTAA AGCTTTAGGA GATGTAATCT CTGTTACAGA ATGTATAAAT 1500

GTTGACCAAT CTAGTGTGAG CATACACAAG AGTCTTAAAA CAGAAAATAA TGACATATGC 1560

TATTCACGGC CTCCAGTTAC ATTTAAATTT GTTAACAGTA GTCAGCTGTT TAAAGGACAG 1620

TTAGGGGCTA GAAATGAAAT TCTTCTGTCA GAAAGTCTTG TAGAAAATTG CCACCAAAAT 1680

GCAGAGACTT TTTTTACAGC TAAAAATGAA ACTTACCACT TTAAAAATTA TGTGCATGTA 1740

GAAACTTTGC CAGTGAATAA CATTTCAACT TTAGACACTT TTTTAGCTCT TAACCTAACT 1800

TTCATAGAAA ATATTGACTT TAAAGCTGTT GAATTGTATT CAAGTGGAGA GAGAAAGTTA 1860

GCAAACGTGT TTGATTTAGA GACTATGTTT AGAGAATATA ACTATTACGC TCAGAGTATA 1920

TCTGGCTTAA GAAAAGATTT TGATAACTCT CAAAGAAACA ACAGAGACAG AATCATTCAA 1980

GATTTTTCAG AAATTCTAGC AGACTTAGGC TCTATCGGCA AAGTTATTGT TAATGTGGCA 2040

AGCGGCGCAT TTTCTCTTTT TGGAGGTATT GTAACAGGCA TATTAAATTT TATTAAAAAT 2100

CCTTTAGGTG GCATGTTCAC ATTTCTATTA ATAGGAGCAG TTATAATCTT AGTAATTCTA 2160

CTAGTACGGC GCACAAATAA TATGTCTCAA GCTCCAATTA GAATGATTTA CCCAGATGTT 2220

GAGAAATCTA AATCTACTGT GACGCCTATG GAGCCTGAAA CAATTAAACA AATTTTGCTT 2280

GGAATGCATA ACATGCAGCA AGAAGCATAT AAGAAAAAAG AAGAACAAAG AGCTGCTAGA 2340

CCGTCTATTT TTAGACAAGC TGCTGAGACA TTTTTGCGTA AGCGATCTGG TTACAAACAG 2400

ATTTCAACCG AAGACAAAAT AGTAT 2425

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2623 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

ATGTATTATA AGACTATCTT ATTCTTCGCT CTAATTAAGG TATGCAGTTT CAACCAGACC 60

ACTACACACT CAACCACAAC CTCACCAAGT ATTTCATCAA CCACCTCTTC CACAACAACA 120

TCAACAAGCA AGCCATCAAA CACAACCTCA ACAAATAGTT CATTAGCTGC CTCTCCCCAG 180

AACACGTCAA CAAGCAAGCC ATCCACTGAT AATCAGGGTA CCAGTACCCC CACTATTCCA 240

ACTGTTACTG ATGACACAGC CAGTAAAAAT TTTTATAAAT ACAGAGTATG CAGTGCATCA 300

TCTTCCTCTG GAGAACTATT CAGATTTGAC CTTGATCAGA CATGTCCAGA TACAAAAGAT 360

AAAAAACATG TGGAAGGCAT CCTGCTGGTA CTAAAAAAGA ATATTGTCCC ATACATCTTC 420

AAAGTGAGGA AATATAGAAA AATTGCCACC TCAGTGACAG TTTACAGAGG GTGGTCCCAG 480

GCAGCTGTTA CCAATAGGGA TGATATCAGC AGAGCCATAC CCTATAATGA AATTTCAATG 540

ATAGATAGGA CCTATCATTG TTTCTCTGCT ATGGCAACAG TCATTAATGG GATTCTGAAC 600

ACCTATATAG ACAGGGATTC TGAAAATAAG TCTGTTCCCC TCCAGCCAGT GGCCGGACTG 660

ACTGAGAACA TAAACAGATA CTTTAGTCAA CCTCTCATAT ATGCAGAACC TGGCTGGTTT 720

CCAGGGATTT ATAGAGTGAG AACAACTGTT AATTGTGAGG TTGTTGACAT GTATGCCCGC 780

TCTGTGGAAC CATATACTCA CTTTATTACA GCTCTGGGGG ACACTATTGA AATCTCCCCA 840

TTCTGTCACA ACAATTCTCA ATGCACCACT GGTAATTCCA CCTCAAGGGA TGCCACAAAG 900

GTATGGATAG AAGAAAATCA CCAAACTGTT GACTATGAAA GACGGGGGCA TCCCACTAAA 960

GATAAAAGAA TCTTTCTAAA AGATGAGGAA TATACCATCT CCTGGAAAGC AGAAGATAGA 1020

GAGAGAGCTA TTTGTGATTT TGTGATATGG AAAACCTTTC CCAGGGCCAT ACAAACAATC 1080

CATAATGAGA GCTTTCACTT TGTGGCAAAT GAAGTCACAG CCAGCTTTTT AACATCCAAC 1140

CAAGAAGAAA CGGAGCTACG TGGAAATACC GAGATATTGA ATTGCATGAA TAGTACCATA 1200

AATGAAACTC TAGAAGAGAC AGTCAAAAAA TTTAACAAAT CCCATATCAG AGATGGGGAG 1260

GTAAAGTACT ATAAAACAAA TGGGGGACTA TTCCTTATCT GGCAGGCAAT GAAACCCCTT 1320

AATCTGTCAG AACACACAAA CTACACTATT GAAAGGAATA ACAAGACTGG AAATAAATCA 1380

AGACAAAAAA GGTCTGTAGA TACAAAGACC TTCCAAGGCG CCAAGGGCCT GTCCACTGCC 1440

CAGGTTCAAT ATGCCTATGA CCATTTAAGA ACAAGCATGA ATCACATCCT AGAGGAATTA 1500

ACCAAAACAT GGTGCCGGGA ACAAAAAAAG GACAATCTAA TGTGGTATGA GCTGAGTAAA 1560

ATTAACCCAG TGAGTGTCAT GGCAGCCATT TATGGGAAAC CTGTGGCAGT GAAAGCCATG 1620

GGAGATGCAT TCATGGTTTC TGAGTGCATC AATGTTGACC AGGCAAGTGT CAATATCCAT 1680

AAAAGTATGA GAACGGATGA TCCCAAGGTA TGTTACTCCA GACCCCTGGT CACATTTAAA 1740

TTTGTGAATA GTACTGCCAC CTTCAGGGGT CAGCTTGGAA CAAGGAATGA AATCTTGCTC 1800

ACAAACACAC ACGTGGAAAC TTGTAGACCA ACAGCAGATC ATTATTTTTT TGTAAAGAAC 1860

ATGACACACT ATTTTAAGGA CTATAAATTT GTGAAGACAA TGGATACCAA TAACATATCC 1920

ACCCTGGATA CATTTTTAAC TCTCAATTTA ACTTTTATAG ACAATATAGA TTTCAAGACA 1980

GTGGAACTTT ACAGTGAGAC TGAAAGAAAG ATGGCCAGTG CCCTCGACCT GGAGACGATG 2040

TTTAGAGAGT ATAATTACTA CACACAGAAG CTTGCAAGTC TGAGAGAAGA TCTAGACAAC 2100

ACCATTGACC TGAACAGGGA CAGACTAGTT AAAGATCTCT CTGAAATGAT GGCAGACCTT 2160

GGAGACATTG GAAAAGTGGT GGTCAACACA TTCAGTGGCA TTGTCACTGT TTTTGGGTCT 2220

ATAGTTGGTG GATTTGTCAG TTTTTTCACA AACCCCATTG GGGGCGTGAC GATCATCCTC 2280

CTTCTCATAG TTGTGGTTTT TGTTGTTTTT ATAGTCTCCA GGAGAACCAA TAACATGAAC 2340

GAGGCCCCCA TAAAAATGAT CTATCCAAAC ATTGACAAAG CCTCTGAGCA GGAGAACATT 2400

CAGCCCCTAC CCGGAGAGGA GATTAAGCGC ATCCTCCTTG GAATGCACCA GCTCCAGCAA 2460

AGTGAGCACG GCAAATCTGA GGAAGAGGCT AGCCATAAAC CAGGGTTGTT CCAACTATTG 2520

GGGGATGGCC TACAATTGCT GCGCAGGCGC GGGTATACTA GGTTACCAAC TTTTGACCCC 2580

AGTCCAGGCA ATGACACATC TGAGACACAC CAAAAATATG TTT 2623

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2625 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

ATGGGGGTCG GGGGCGGGCC TCGCGTCGTC CTCTGTCTAT GGTGCGTCGC TGCGCTTCTC 60

TGCCAGGGGG TGGCGCAAGA AGTTGTGGCT GAAACGACCA CCCCGTTCGC AACCCACAGA 120

CCAGAAGTGG TGGCCGAGGA GAACCCGGCC AACCCCTTTC TGCCGTTCAG GGTATGCGGG 180

GCCTCGCCTA CGGGCGGAGA GATATTCAGG TTCCCCCTGG AGGAGAGCTG CCCCAACACG 240

GAAGACAAGG ACCACATAGA GGGCATAGCT CTCATCTACA AGACCAACAT AGTGCCTTAT 300

GTTTTTAATG TCAGAAAGTA TAGGAAGATC ATGACCTCGA CCACCATCTA CAAGGGTTGG 360

AGCGAGGATG CCATAACAAA CCAGCACACG AGGAGCTACG CCGTCCCCCT GTACGAGGTC 420

CAGATGATGG ACCACTATTA TCAGTGCTTT AGCGCCGTAC AGGTCAACGA GGGGGGGCAC 480

GTCAACACCT ACTATGACAG GGACGGGTGG AACGAGACCG CCTTCCTCAA ACCGGCCGAT 540

GGTCTCACCT CTAGCATAAC GCGCTATCAG AGTCAACCAG AGGTGTACGC CACCCCCAGA 600

AACCTGTTGT GGTCTTACAC AACAAGAACC ACAGTCAACT GCGAGGTGAC AGAGATGTCT 660

GCGAGATCCA TGAAACCATT TGAGTTCTTT GTGACGTCTG TTGGTGACAC TATAGAGATG 720

TCGCCCTTTT TAAAAGAAAA TGGCACAGAG CCAGAGAAAA TCTTGAAAAG ACCACACTCT 780

ATTCAACTGC TGAAAAACTA TGCTGTCACA AAGTACGGTG TGGGGTTGGG GCAGGCTGAT 840

AACGCTACCA GATTCTTTGC AATATTTGGG GACTATTCCC TGTCTTGGAA AGCCACCACT 900

GAAAACAGCT CCTACTGTGA TTTAATTTTA TGGAAGGGGT TTTCCAATGC CATTCAAACT 960

CAACACAATA GCAGTCTCCA TTTTATTGCC AATGATATAA CAGCCTCCTT CTCTACTCCT 1020

TTAGAAGAAG AGGCTAATTT TAACGAGACA TTTAAGTGTA TATGGAACAA CACCCAAGAA 1080

GAAATTCAAA AAAAGTTAAA AGAGGTTGAA AAAACTCACA GACCTAACGG TACTGCGAAG 1140

GTCTATAAAA CAACAGGCAA TCTGTACATT GTTTGGCAAC CGCTTATACA GATAGACCTG 1200

CTAGATACTC ATGCCAAGCT GTACAATCTC ACAAACGCTA CAGCTTCACC TACATCAACA 1260

CCCACAACAT CTCCCAGGAG AAGACGCAGG GATACTTCAA GTGTTAGTGG CGGTGGAAAT 1320

AATGGAGACA ACTCAACTAA GGAAGAGAGT GTGGCGGCCT CCCAGGTTCA GTTTGCCTAT 1380

GACAATCTCA GAAAGAGCAT CAACAGGGTG TTGGGAGAGC TGTCCAGGGC ATGGTGCAGG 1440

GAACAGTACA GGGCCTCGCT CATGTGGTAC GAGCTGAGCA AGATCAACCC CACCAGCGTC 1500

ATGAGCGCCA TCTATGGCAG GCCAGTGTCT GCCAAGTTGA TAGGGGACGT GGTGTCAGTG 1560

TCAGATTGTA TCAGTGTTGA CCAAAAGAGC GTGTTTGTGC ACAAAAATAT GAAGGTGCCT 1620

GGCAAAGAAG ACCTGTGTTA CACCAGGCCT GTGGTGGGCT TCAAGTTTAT CAATGGGAGC 1680

GAACTGTTTG CTGGCCAGCT GGGTCCCAGG AACGAGATTG TGCTGTCCAC CTCTCAGGTG 1740

GAGGTCTGCC AGCACAGCTG CGAGCACTAC TTCCAGGCCG GGAACCAGAT GTACAAGTAC 1800

AAGGACTACT ACTATGTCAG TACCCTCAAC CTGACTGACA TACCCACCCT ACACACCATG 1860

ATTACCCTGA ACCTGTCTCT GGTAGAGAAT ATAGATTTTA AGGTGATTGA GCTCTATTCT 1920

AAAACAGAGA AAAGGCTGTC CAACGTGTTT GACATCGAGA CCATGTTCAG GGAGTACAAC 1980

TACTACACTC AGAACCTCAA CGGGCTGAGG AAGGACCTGG ATGACAGCAT AGATCATGGC 2040

AGGGACAGCT TCATCCAGAC CCTGGGTGAC ATCATGCAGG ACCTGGGCAC CATAGGCAAG 2100

GTGGTGGTCA ATGTGGCCAG CGGAGTGTTC TCCCTCTTTG GGAGCATAGT CTCGGGGGTG 2160

ATAAGCTTTT TCAAAAATCC CTTTGGGGGC ATGCTGCTCA TAGTCCTCAT CATAGCCGGG 2220

GTAGTGGTGG TGTACCTGTT TATGACCAGG TCCAGGAGCA TATACTCTGC CCCCATTAGA 2280

ATGCTCTACC CCGGGGTGGA GAGGGCGGCC CAGGAGCCGG GCGCGCACCC GGTGTCAGAA 2340

GACCAAATCA GGAACATCCT GATGGGAATG CACCAATTTC AGCAGCGGCA GCGGGCGGAA 2400

GAGGAGGCCC GACGAGAGGA AGAAGTAAAA GGAAAAAGAA CTCTCTTTGA AGTGATAAGA 2460

GACTCTGCGA CCAGCGTTCT GAGGAGGAGA AGAGGGGGTG GTGGGTACCA GCGCCTACAG 2520

CGAGACGGGA GCGACGATGA GGGGGATTAT GAGCCATTGA GGCGACAAGA TGGAGGCTAC 2580

GACGACGTGG ACGTGGAGGC AGGCACGGCG GATACCGGTG TGTAA 2625

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2548 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

ATGTACCCTA CAGTGAAAAG TATGAGAGTC GCCCACCTAA CCAATCTCCT AACCCTTCTG 60

TGTCTGCTGT GCCACACGCA TCTCTACGTA TGTCAGCCAA CCACTCTGAG GCAGCCATCA 120

GACATGACCC CAGCCCAGGA CGCTCCAACA GAGACTCCCC CACCCCTCTC AACTAACACT 180

AACAGAGGAT TTGAGTACTT TCGCGTGTGT GGGGTGGCTG CCACGGGGGA GACCTTCAGG 240

TTTGATTTAG ACAAAACATG CCCCAGTACA CAAGATAAGA AGCATGTGGA GGGCATCTTG 300

CTCGTGTATA AGATCAACAT CGTGCCCTAC ATCTTCAAAA TCAGGAGATA TAGAAAAATA 360

ATTACTCAAC TGACCATCTG GCGAGGCCTA ACCACTAGTT CAGTCACTGG TAAATTTGAA 420

ATGGCCACTC AGGCCCACGA GTGGGAAGTG GGCGACTTTG ACAGCATCTA TCAGTGCTAC 480

AATAGCGCCA CCATGGTGGT AAACAACGTC AGACAGGTGT ATGTGGACAG AGATGGGGTC 540

AATAAAACTG TGAACATACG CCCTGTTGAT GGTCTAACAG GGAATATCCA AAGATACTTT 600

AGTCAGCCCA CCCTTTATTC AGAACCTGGT TGGATGCCTG GCTTTTATCG TGTTCGAACC 660

ACCGTTAACT GTGAAATTGT AGACATGGTG GCACGCTCCA TGGATCCCTA TAACTACATC 720

GCTACCGCCC TGGGAGACAG CCTGGAGCTC TCCCCGTTTC AAACCTTTGA CAACACCAGC 780

CAGTGTACTG CGCCTAAGAG AGCTGATATG AGGGTCAGGG AGGTCAAGAA TTACAAGTTT 840

GTAGATTATA ATAACAGGGG AACTGCCCCC GCTGGACAAA GCAGGACCTT TCTAGAGACT 900

CCCTCTGCCA CTTACTCCTG GAAAACAGCC ACCAGACAAA CTGCCACGTG CGACCTGGTG 960

CACTGGAAAA CATTCCCTCG CGCCATCCAA ACTGCTCATG AACATAGCTA CCATTTTGTG 1020

GCCAATGAAG TCACCGCCAC CTTCAATACA CCCCTGACTG AGGTAGAAAA TTTCACCAGC 1080

ACGTATAGCT GCGTCAGTGA CCAGATCAAT AAGACCATCT CTGAATATAT CCAAAAGTTG 1140

AACAACTCCT ACGTGGCCAG TGGGAAAACA CAGTATTTCA AGACTGATGG TAACCTGTAC 1200

CTCATCTGGC AACCACTCGA ACATCCAGAG ATTGAAGACA TAGACGAGGA CAGCGACCCA 1260

GAACCAACCC CCGCCCCACC AAAGTCCACA AGGAGAAAAA GAGAGGCAGC TGACAATGGA 1320

AACTCAACAT CTGAGGTCTC AAAGGGCTCA GAAAATCCGC TCATTACGGC CCAAATTCAA 1380

TTTGCCTATG ACAAGCTGAC CACCAGCGTC AACAACGTGC TTGAGGAGTT GTCCAGGGCG 1440

TGGTGTAGAG AACAGGTCAG AGACACCCTC ATGTGGTATG AGCTTAGCAA GGTCAACCCT 1500

ACGAGTGTGA TGTCTGCCAT TTATGGAAAG CCTGTCGCTG CCAGGTACGT GGGCGACGCC 1560

ATATCTGTGA CAGACTGTAT CTATGTGGAC CAAAGTTCAG TCAACATCCA CCAGAGCTTG 1620

CGGCTGCAGC ATGATAAAAC CACCTGCTAC TCGAGACCTA GAGTCACCTT CAAATTTATA 1680

AACAGTACAG ACCCGCTAAC TGGCCAGTTG GGTCCTAGAA AAGAAATTAT CCTCTCCAAC 1740

ACAAACATAG AAACATGCAA GGATGAGAGT GAACACTACT TCATTGTGGG GGAATACATT 1800

TACTATTATA AAAATTACAT TTTTGAAGAA AAGCTAAACC TCTCAAGCAT CGCTACCCTA 1860

GACACATTTA TAGCCCTCAA TATCTCATTT ATTGAAAATA TCGACTTCAA AACAGTAGAA 1920

CTGTACTCCT CTACTGAAAG GAAACTCGCA TCGAGCGTCT TTGATATAGA ATCCATGTTT 1980

AGGGAATATA ACTATTACAC CTACAGCCTC GCGGGCATTA AGAAGGACCT AGACAACACC 2040

ATCGACTACA ATAGAGACAG ACTGGTTCAG GACCTGTCAG ACATGATGGC TGATCTGGGA 2100

GACATTGGAA GATCTGTGGT GAATGTGGTC AGCTCGGTAG TCACATTTTT CAGTAGTATT 2160

GTGACAGGGT TCATTAAATT CTTTACCAAC CCTCTAGGGG GAATATTCAT TCTCCTAATT 2220

ATTGGTGGAA TAATCTTCTT GGTGGTAGTC CTAAATAGAA GAAACTCACA GTTTCACGAT 2280

GCACCCATCA AAATGCTGTA CCCTTCTGTT GAAAACTACG CTGCCAGACA GGCGCCACCT 2340

CCCTATAGCG CATCACCTCC AGCTATAGAC AAAGAGGAAA TTAAGCGCAT ACTTTTGGGC 2400

ATGCATCAGG TACACCAGGA AGAAAAGGAA GCACAGAAAC AACTAACCAA CTCTGGCCCT 2460

ACTTTGTGGC AGAAAGCCAC AGGATTCCTT AGAAATCGCC GGAAGGGATA CAGCCAACTT 2520

CCTCTGGAAG ATGAATCAAC TTCCCTCT 2548

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2572 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ATGACTCGGC GTAGGGTGCT AAGCGTGGTC GTGCTGCTAG CCGCCCTGGC GTGCCGTCTC 60

GGTGCGCAGA CCCCAGAGCA GCCCGCACCC CCCGCCACCA CGGTGCAGCC TACCGCCACG 120

CGTCAGCAAA CCAGCTTTCC TTTCCGAGTC TGCGAGCTCT CCAGCCACGG CGACCTGTTC 180

CGCTTCTCCT CGGACATCCA GTGTCCCTCG TTTGGCACGC GGGAGAATCA CACGGAGGGC 240

CTGTTGATGG TGTTTAAAGA CAACATTATT CCCTACTCGT TTAAGGTCCG CTCCTACACC 300

AAGATAGTGA CCAACATTCT CATCTACAAT GGCTGGTACG CGGACTCCGT GACCAACCGG 360

CACGAGGAGA AGTTCTCCGT TGACAGCTAC GAAACTGACC AGATGGATAC CATCTACCAG 420

TGCTACAACG CGGTCAAGAT GACAAAAGAT GGGCTGACGC GCGTGTATGT AGACCGCGAC 480

GGAGTTAACA TCACCGTCAA CCTAAAGCCC ACCGGGGGCC TGGCCAACGG GGTGCGCCGC 540

TACGCCAGCC AGACGGAGCT CTATGACGCC CCCGGGTGGT TGATATGGAC TTACAGAACA 600

AGAACTACCG TCAACTGCCT GATAACTGAC ATGATGGCCA AGTCCAACAG CCCCTTCGAC 660

TTCTTTGTGA CCACCACCGG GCAGACTGTG GAAATGTCCC CTTTCTATGA CGGGAAAAAT 720

AAGGAAACCT TCCATGAGCG GGCAGACTCC TTCCACGTGA GAACTAACTA CAAGATAGTG 780

GACTACGACA ACCGAGGGAC GAACCCGCAA GGCGAACGCC GAGCCTTCCT GGACAAGGGC 840

ACTTACACGC TATCTTGGAA GCTCGAGAAC AGGACAGCCT ACTGCCCGCT TCAACACTGG 900

CAAACCTTTG ACTCGACCAT CGCCACAGAA ACAGGGAAGT CAATACATTT TGTGACTGAC 960

GAGGGCACCT CTAGCTTCGT GACCAACACA ACCGTGGGCA TAGAGCTCCC GGACGCCTTC 1020

AAGTGCATCG AAGAGCAGGT GAACAAGACC ATGCATGAGA AGTACGAGGC CGTCCAGGAT 1080

CGTTACACGA AGGGCCAGGA AGCCATTACA TATTTTATAA CGAGCGGAGG ATTGTTATTA 1140

GCTTGGCTAC CTCTGACCCC GCGCTCGTTG GCCACCGTCA AGAACCTGAC GGAGCTTACC 1200

ACTCCGACTT CCTCACCCCC CAGCAGTCCA TCGCCCCCAG CCCCATCCGC GGCCCGCGGG 1260

AGCACCCCCG CCGCCGTTCT GAGGCGTCGG AGGCGGGATG CGGGGAACGC CACCACACCG 1320

GTGCCCCCCA CGGCCCCCGG GAAGTCCCTG GGCACCCTCA ACAATCCCGC CACCGTCCAG 1380

ATCCAATTTG CCTACGACTC CCTGCGCCGC CAGATCAACC GCATGCTGGG AGACCTTGCG 1440

CGGGCCTGGT GCCTGGAGCA GAAGAGGCAG AACATGGTGC TGAGAGAACT AACCAAGATT 1500

AATCCAACCA CCGTCATGTC CAGCATCTAC GGTAAGGCGG TGGCGGCCAA GCGCCTGGGG 1560

GATGTCATCT CAGTCTCCCA GTGCGTGCCC GTTAACCAGG CCACCGTCAC CCTGCGCAAG 1620

AGCATGAGGG TCCCTGGCTC CGAGACCATG TGCTACTCGC GCCCCCTGGT GTCCTTCAGC 1680

TTTATCAACG ACACCAAGAC CTACGAGGGA CAGCTGGGCA CCGACAACGA GATCTTCCTC 1740

ACAAAAAAGA TGACGGAGGT GTGCCAGGCG ACCAGCCAGT ACTACTTCCA GTCCGGCAAC 1800

GAGATCCACG TCTACAACGA CTACCACCAC TTTAAAACCA TCGAGCTGGA CGGCATTGCC 1860

ACCCTGCAGA CCTTCATCTC ACTAAACACC TCCCTCATCG AGAACATTGA CTTTGCCTCC 1920

CTGGAGCTGT ACTCACGGGA CGAACAGCGT GCCTCCAACG TCTTTGACCT GGAGGGCATC 1980

TTCCGGGAGT ACAACTTCCA GGCGCAAAAC ATCGCCGGCC TGCGGAAGGA TTTGGACAAT 2040

GCAGTGTCAA ACGGAAGAAA TCAATTCGTG GACGGCCTGG GGGAACTTAT GGACAGTCTG 2100

GGTAGCGTGG GTCAGTCCAT CACCAACCTA GTCAGCACGG TGGGGGGTTT GTTTAGCAGC 2160

CTGGTCTCTG GTTTCATCTC CTTCTTCAAA AACCCCTTCG GCGGCATGCT CATTCTGGTC 2220

CTGGTGGCGG GGGTGGTGAT CCTGGTTATT TCCCTCACGA GGCGCACGCG CCAGATGTCG 2280

CAGCAGCCGG TGCAGATGCT CTACCCCGGG ATCGACGAGC TCGCTCAGCA ACATGCCTCT 2340

GGTGAGGGTC CAGGCATTAA TCCCATTAGT AAGACAGAAT TACAAGCCAT CATGTTAGCG 2400

CTGCATGAGC AAAACCAGGA GCAAAAGAGA GCAGCTCAGA GGGCGGCCGG ACCCTCAGTG 2460

GCCAGCAGAG CATTGCAGGC AGCCAGGGAC CGTTTTCCAG GCCTACGCAG AAGACGCTAT 2520

CACGATCCAG AGACCGCCGC CGCACTGCTT GGGGAGGCAG AGACTGAGTT TT 2572



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2722 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

ATGGAATCCA GGATCTGGTG CCTGGTAGTC TGCGTTAACC TGTGTATCGT CTGTCTGGGT 60

GCTGCGGTTT CCTCTTCTAG TACTTCCCAT GCAACTTCTT CTACTCACAA TGGAAGCCAT 120

ACTTCTCGTA CGACGTCTGC TCAAACCCGG TCAGTCTATT CTCAACACGT AACGTCTTCT 180

GAAGCCGTCA GTCATAGAGC CAACGAGACT ATCTACAACA CTACCCTCAA GTACGGAGAT 240

GTGGTGGGAG TCAACACTAC CAAGTACCCC TATCGCGTGT GTTCTATGGC CCAGGGTACG 300

GATCTTATTC GCTTTGAACG TAATATCATC TGCACCTCGA TGAAGCCTAT CAATGAAGAC 360

TTGGATGAGG GCATCATGGT GGTCTACAAG CGCAACATCG TGGCGCACAC CTTTAAGGTA 420

CGGGTCTACC AAAAGGTTTT GACGTTTCGT CGTAGCTACG CTTACATCTA CACCACTTAT 480

CTGCTGGGCA GCAATACGGA ATACGTGGCG CCTCCTATGT GGGAGATTCA TCACATCAAC 540

AAGTTTGCTC AATGCTACAG TTCCTACAGC CGCGTTATAG GAGGCACGGT TTTCGTGGCA 600

TATCATAGGG ACAGTTATGA AAACAAAACC ATGCAATTAA TTCCCGACGA TTATTCCAAC 660

ACCCACAGTA CCCGTTACGT GACGGTCAAG GATCAGTGGC ACAGCCGCGG CAGCACCTGG 720

CTCTATCGTG AGACCTGTAA TCTGAACTGT ATGCTGACCA TCACTACTGC GCGCTCCAAG 780

TATCCTTATC ATTTTTTTGC AACTTCCACG GGTGATGTGG TTTACATTTC TCCTTTCTAC 840

AACGGAACCA ATCGCAATGC CAGCTACTTT GGAGAAAACG CCGACAAGTT TTTCATTTTC 900

CCGAACTACA CCATCGTTTC CGACTTTGGA AGACCCAACG CTGCGCCAGA AACCCATAGG 960

TTGGTGGCTT TTCTCGAACG TGCCGACTCG GTGATCTCTT GGGATATACA GGACGAGAAG 1020

AATGTCACCT GCCAGCTCAC CTTCTGGGAA GCCTCGGAAC GTACTATCCG TTCCGAAGCC 1080

GAAGACTCGT ACCACTTTTC TTCTGCCAAA ATGACTGCAA CTTTTCTGTC TAAGAAACAA 1140

GAAGTGAACA TGTCCGACTC CGCGCTGGAC TGCGTACGTG ATGAGGCTAT AAATAAGTTA 1200

CAGCAGATTT TCAATACTTC ATACAATCAA ACATATGAAA AATACGGAAA CGTGTCCGTC 1260

TTCGAAACCA GCGGCGGTCT GGTGGTGTTC TGGCAAGGCA TCAAGCAAAA ATCTTTGGTG 1320

GAATTGGAAC GTTTGGCCAA TCGATCCAGT CTGAATATCA CTCATAGGAC CAGAAGAAGT 1380

ACGAGTGACA ATAATACAAC TCATTTGTCC AGCATGGAAT CGGTGCACAA TCTGGTCTAC 1440

GCCCAGCTGC AGTTCACCTA TGACACGTTG CGCGGTTACA TCAACCGGGC GCTGGCGCAA 1500

ATCGCAGAAG CCTGGTGTGT GGATCAACGG CGCACCCTAG AGGTCTTCAA GGAACTCAGC 1560

AAGATCAACC CGTCAGCCAT TCTCTCGGCC ATTTACAACA AACCGATTGC CGCGCGTTTC 1620

ATGGGTGATG TCTTGGGCCT GGCCAGCTGC GTGACCATCA ACCAAACCAG CGTCAAGGTG 1680

CTGCGTGATA TGAACGTGAA GGAATCGCCA GGACGCTGCT ACTCACGACC CGTGGTCATC 1740

TTTAATTTCG CCAACAGCTC GTACGTGCAG TACGGTCAAC TGGGCGAGGA CAACGAAATC 1800

CTGTTGGGCA ACCACCGCAC TGAGGAATGT CAGCTTCCCA GCCTCAAGAT CTTCATCGCC 1860

GGGAACTCGG CCTACGAGTA CGTGGACTAC CTCTTCAAAC GCATGATTGA CCTCAGCAGT 1920

ATCTCCACCG TCGACAGCAT GATCGCCCTG GATATCGACC CGCTGGAAAA TACCGACTTC 1980

AGGGTACTGG AACTTTACTC GCAGAAAGAG CTGCGTTCCA GCAACGTTTT TGACCTCGAA 2040

GAGATCATGC GCGAATTCAA CTCGTACAAG CAGCGGGTAA AGTACGTGGA GGACAAGGTA 2100

GTCGACCCGC TACCGCCCTA CCTCAAGGGT CTGGACGACC TCATGAGCGG CCTGGGCGCC 2160

GCGGGAAAGG CCGTTGGCGT AGCCATTGGG GCCGTGGGTG GCGCGGTGGC CTCCGTGGTC 2220

GAAGGCGTTG CCACCTTCCT CAAAAACCCC TTCGGAGCCT TCACCATCAT CCTCGTGGCC 2280

ATAGCCGTAG TCATTATCAC TTATTTGATC TATACTCGAC AGCGGCGTCT GTGCACGCAG 2340

CCGCTGCAGA ACCTCTTTCC CTATCTGGTG TCCGCCGACG GGACCACCGT GACGTCGGGC 2400

AGCACCAAAG ACACGTCGTT ACAGGCTCCG CCTTCCTACG AGGAAAGTGT TTATAATTCT 2460

GGTCGCAAAG GACCGGGACC ACCGTCGTCT GATGCATCCA CGGCGGCTCC GCCTTACACC 2520

AACGAGCAGG CTTACCAGAT GCTTCTGGCC CTGGCCCGTC TGGACGCAGA GCAGCGAGCG 2580

CAGCAGAACG GTACAGATTC TTTGGACGGA CAGACTGGCA CGCAGGACAA GGGACAGAAG 2640

CCTAACCTGC TAGACCGGCT GCGACATCGC AAAAACGGCT ACAGACACTT GAAAGACTCC 2700

GACGAAGAAG AGAACGTCTG AA 2722

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2493 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

ATGAGCAAGA TGAGAGTATT ATTCCTGGCT GTCTTTTTGA TGAATAGTGT TTTAATGATA 60

TATTGCGATT CGGATGATTA TATCAGAGCG GGCTATAATC ACAAATATCC TTTTCGGATT 120

TGTTCGATTG CCAAAGGCAC TGATTTGATG CGGTTCGACA GAGATATTTC GTGTTCGCCA 180

TATAAGTCTA ATGCAAAGAT GTCGGAGGGT TTTTTCATCA TTTACAAAAC AAATATCGAG 240

ACCTACACTT TTCCAGTGAG AACATATAAA AACGAGCTGA CGTTCCAAAC CAGTTACCGT 300

GATGTGGGTG TGGTTTATTT TCTGGATCGG ACGGTGATGG GTTTGGCCAT GCCGGTGTAC 360

GAAGCAAATT TAGTTAATTC TCGTGCGCAG TGTTATTCAG CCGTAGCGAT AAAACGACCC 420

GATGGTACGG TGTTTAGTGC CTATCATGAG GATAATAATA AAAACGAAAC TCTAGAATTA 480

TTTCCTCTGA ATTTCAAGTC TGTTACTAAT AAAAGATTTA TCACTACGAA AGAACCCTAC 540

TTTGCAAGGG GTCCTTTGTG GCTCTATTCT ACATCGACGT CTCTCAATTG TATTGTGACG 600

GAGGCTACGG CTAAGGCGAA ATATCCGTTT AGTTACTTTG CTTTGACGAC TGGTGAAATC 660

GTGGAAGGGT CTCCGTTCTT CGACGGTTCA AACGGTAAAC ATTTTGCAGA GCCGTTAGAA 720

AAATTGACAA TCTTGGAAAA CTATACTATG ATAGAAGATC TAATGAATGG TATGAATGGG 780

GCTACTACGT TAGTAAGGAA GATCGCTTTT CTGGAGAAAG GGGATACTTT GTTTTCTTGG 840

GAAATCAAGG AAGAGAATGA ATCGGTGTGT ATGCTAAAGC ACTGGACTAC GGTGACTCAC 900

GGGCTTCGAG CGGAGACGGA TGAGACTTAT CACTTTATTT CTAAGGAGTT GACAGCCGCT 960

TTCGTCGCCT CCAAGGAGTC TTTAAATCTT ACCGATCCCA AACAAACGTG TATTAAGAAT 1020

GAATTTGAGA AGATAATTAC AGATGTCTAT ATGTCAGATT ATAATGATGA CTACAGCATG 1080

AACGGTAGTT ATCAAATTTT TAAGACTACG GGAGATCTGA TTTTGATTTG GCAGCCTCTT 1140

GTGCAAAAAT CTCTTATGGT TCTTGAGCAG GGTTCAGTAA ACTTACGTAG GAGGCGAGAT 1200

TTGGTGGATG TCAAGTCTAG ACATGATATT CTTTATGTGC AATTACAGTA CCTCTATGAT 1260

ACTTTGAAAG ATTATATCAA CGATGCCTTG GGGAATTTGG CAGAATCTTG GTGCCTCGAT 1320

CAAAAACGAA CGATAACGAT GTTGCACGAA CTTAGTAAGA TCAGTCCATC GAGTATCGTG 1380

TCTGAGGTTT ACGGTCGTCC GATATCTGCA CAGTTGCATG GTGATGTGTT AGCTATCTCG 1440

AAATGCATAG AAGTTAATCA ATCATCCGTT CAGCTTTATA AGAGTATGCG GGTCGTCGAT 1500

GCGAAGGGAG TAAGGAGTGA AACGATGTGT TATAATCGGC CCTTGGTGAC GTTTAGCTTT 1560

GTGAACTCCA CGCCTGAGGT TGTCCTTGGT CAGCTAGGGT TAGATAATGA GATTCTGTTG 1620

GGTGATCATA GGACAGAGGA ATGTGAGATA CCTAGTACAA AGATATTTCT ATCTGGAAAT 1680

CATGCACACG TGTATACCGA TTATACGCAT ACGAATTCGA CGCCCATAGA AGACATTGAG 1740

GTATTGGATG CTTTTATTAG ACTAAAGATC GACCCTCTCG AAAATGCTGA TTTTAAACTA 1800

CTTGATTTAT ATTCGCCGGA CGAATTGAGT AGAGCAAACG TTTTCGATTT AGAGAATATT 1860

CTTCGTGAAT ATAACTCATA TAAGAGCGCA CTATATACTA TAGAAGCTAA AATTGCTACT 1920

AATACGCCGT CGTATGTCAA TGGGATTAAT TCTTTTTTAC AAGGGCTTGG GGCTATAGGC 1980

ACTGGATTGG GCTCGGTTAT AAGTGTTACG GCAGGAGCAC TTGGGGATAT TGTGGGTGGA 2040

GTGGTGTCTT TTTTAAAAAA TCCATTCGGG GGTGGTCTCA TGTTGATTTT AGCGATAGTA 2100

GTTGTCGTTA TAATAATTGT GGTTTTCGTT AGACAAAAAC ATGTGCTTAG TAAGCCTATT 2160

GACATGATGT TTCCTTATGC CACCAATCCG GTGACTACTG TGTCCAGTGT TACGGGGACC 2220

ACTGTCGTCA AGACGCCTAG TGTTAAAGAT GCTGACGGGG GCACATCTGT TGCGGTTTCG 2280

GAAAAAGAGG AGGGTATGGC TGACGTCAGT GGACAAATAA GTGGTGATGA ATATTCACAA 2340

GAAGATGCTT TAAAAATGCT CAAGGCCATA AAGTCTTTAG ACGAGTCCTA CAGAAGAAAA 2400

CCTTCGTCTT CTGAGTCTCA TGCCTCAAAA CCTAGTTTGA TAGACAGGAT CAGGTATAGA 2460

GGTTATAAGA GTGTAAATGT AGAAGAAGCG TGA 2493

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2608 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

ATGTTTGTTA CGGCGGTTGT GTCGGTCTCT CCAAGCTCGT TTTATGAGAG TTTACAAGTA 60

GAGCCCACAC AATCAGAAGA TATAACCCGG TCTGCTCATC TGGGCGATGG TGATGAAATC 120

AGAGAAGCTA TACACAAGTC CCAGGACGCC GAAACAAAAC CCACGTTTTA CGTCTGCCCA 180

CCGCCAACAG GCTCCACAAT CGTACGATTA GAACCAACTC GGACATGTCC GGATTATCAC 240

CTTGGTAAAA ACTTTACAGA GGGTATTGCT GTTGTTTATA AAGAAAACAT TGCAGCGTAC 300

AAGTTTAAGG CGACGGTATA TTACAAAGAT GTTATCGTTA GCACGGCGTG GGCCGGAAGT 360

TCTTATACGC AAATTACTAA TAGATATGCG GATAGGGTAC CAATTCCCGT TTCAGAGATC 420

ACGGACACCA TTGATAAGTT TGGCAAGTGT TCTTCTAAAG CAACGTACGT ACGAAATAAC 480

CACAAAGTTG AAGCCTTTAA TGAGGATAAA AATCCACAGG ATATGCCTCT AATCGCATCA 540

AAATATAATT CTGTGGGATC CAAAGCATGG CATACTACCA ATGACACGTA CATGGTTGCC 600

GGAACCCCCG GAACATATAG GACGGGCACG TCGGTGAATT GCATCATTGA GGAAGTTGAA 660

GCCAGATCAA TATTCCCTTA TGATAGTTTT GGACTTTCCA CGGGAGATAT AATATACATG 720

TCCCCGTTTT TTGGCCTACG GGATGGTGCA TACAGAGAAC ATTCCAATTA TGCAATGGAT 780

CGTTTTCACC AGTTTGAGGG TTATAGACAA AGGGATCTTG ACACTAGAGC ATTACTGGAA 840

CCTGCAGCGC GGAACTTTTT AGTCACGCCT CATTTAACGG TTGGTTGGAA CTGGAAGCCA 900

AAACGAACGG AAGTTTGTTC GCTTGTCAAG TGGCGTGAGG TTGAAGACGT AGTTCGCGAT 960

GAGTATGCAC ACAATTTTCG CTTTACAATG AAAACACTTT CTACCACGTT TATAAGTGAA 1020

ACAAACGAGT TTAATCTTAA CCAAATCCAT CTCAGTCAAT GTGTAAAGGA GGAAGCCCGG 1080

GCTATTATTA ACCGGATCTA TACAACCAGA TACAACTCAT CTCATGTTAG AACCGGGGAT 1140

ATCCAGACCT ACCTTGCCAG AGGGGGGTTT GTTGTGGTGT TTCAACCCCT GCTGAGCAAT 1200

TCCCTCGCCC GTCTCTATCT CCAAGAATTG GTCCGTGAAA ACACTAATCA TTCACCACAA 1260

AAACACCCGA CTCGAAATAC CAGATCCCGA CGAAGCGTGC CAGTTGAGTT GCGTGCCAAT 1320

AGAACAATAA CAACCACCTC ATCGGTGGAA TTTGCTATGC TCCAGTTTAC ATATGACCAC 1380

ATTCAAGAGC ATGTTAATGA AATGTTGGCA CGTATCTCCT CGTCGTGGTG CCAGCTACAA 1440

AATCGCGAAC GCGCCCTTTG GAGCGGACTA TTTCCAATTA ACCCAAGTGC TTTAGCGAGC 1500

ACCATTTTGG ATCAACGTGT TAAAGCTCGT ATTCTCGGCG ACGTTATCTC CGTTTCTAAT 1560

TGTCCAGAAC TGGGATCAGA TACACGCATT ATACTTCAAA ACTCTATGAG GGTATCTGGT 1620

AGTACTACGC GTTGTTATAG CCGTCCTTTA ATTTCAATAG TTAGTTTAAA TGGGTCCGGG 1680

ACGGTGGAGG GCCAGCTTGG AACAGATAAC GAGTTAATTA TGTCCAGAGA TCTGTTAGAA 1740

CCATGCGTGG CTAATCACAA GCGATATTTT CTATTTGGGC ATCACTACGT ATATTATGAG 1800

GATTATCGTT ACGTCCGTGA AATCGCAGTC CATGATGTGG GAATGATTAG CACTTACGTA 1860

GATTTAAACT TAACACTTCT TAAAGATAGA GAGTTTATGC CGCTGCAAGT ATATACAAGA 1920

GACGAGCTGC GGGATACAGG ATTACTAGAC TACAGTGAAA TTCAACGCCG AAATCAAATG 1980

CATTCGCTGC GTTTTTATGA CATAGACAAG GTTGTGCAAT ATGATAGCGG AACGGCCATT 2040

ATGCAGGGCA TGGCTCAGTT TTTCCAGGGA CTTGGGACCG CGGGCCAGGC CGTTGGACAT 2100

GTGGTTCTTG GGGCCACGGG AGCGCTGCTT TCCACCGTAC ACGGATTTAC CACGTTTTTA 2160

TCTAACCCAT TTGGGGCATT GGCCGTGGGA TTATTGGTTT TGGCGGGACT GGTAGCGGCC 2220

TTTTTTGCGT ACCGGTACGT GCTTAAACTT AAAACAAGCC CGATGAAGGC ATTATATCCA 2280

CTCACAACCA AGGGGTTAAA ACAGTTACCG GAAGGAATGG ATCCCTTTGC CGAGAAACCC 2340

AACGCTACTG ATACCCCAAT AGAAGAAATT GGCGACTCAC AAAACACTGA ACCGTCGGTA 2400

AATAGCGGGT TTGATCCCGA TAAATTTCGA GAAGCCCAGG AAATGATTAA ATATATGACG 2460

TTAGTATCTG CGGCTGAGCG CCAAGAATCT AAAGCCCGCA AAAAAAATAA GACTAGCGCC 2520

CTTTTAACTT CACGTCTTAC CGGCCTTGCT TTACGAAATC GCCGAGGATA CTCCCGTGTT 2580

CGCACCGAGA ATGTAACGGG GGTGTAAA 2608

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2713 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



ATGCGCCAGG GCGCCGCGCG GGGGTGCCGG TGGTTCGTCG TATGGGCGCT CTTGGGGTTG 60

ACGCTGGGGG TCCTGGTGGC GTCGGCGGCT CCGAGTTCCC CCGGCACGCC TGGGGTCGCG 120

GCCGCGACCC AGGCGGCGAA CGGGGGACCT GCCACTCCGG CGCCGCCCGC CCCTGGCCCC 180

GCCCCAACGG GGGACACGAA ACCGAAGAAG AACAAAAAAC CGAAAAACCC ACCGCCGCCG 240

CGCCCCGCCG GCGACAACGC GACCGTCGCC GCGGGCCACG CCACCCTGCG CGAGCACCTG 300

CGGGACATCA AGGCGGAGAA CACCGATGCA AACTTTTACG TGTGCCCACC CCCCACGGGC 360

GCCACGGTGG TGCAGTTCGA GCAGCCGCGC CGCTGCCCGA CCCGGCCCGA GGGTCAGAAC 420

TACACGGAGG GCATCGCGGT GGTCTTCAAG GAGAACATCG CCCCGTACAA GTTCAAGGCC 480

ACCATGTACT ACAAAGACGT CACCGTTTCG CAGGTGTGGT TCGGCCACCG CTACTCCCAG 540

TTTATGGGGA TCTTTGAGGA CCGCGCCCCC GTCCCCTTCG AGGAGGTGAT CGACAAGATC 600

AACGCCAAGG GGGTCTGTCG GTCCACGGCC AAGTACGTGC GCAACAACCT GGAGACCACC 660

GCGTTTCACC GGGACGACCA CGAGACCGAC ATGGAGCTGA AACCGGCCAA CGCCGCGACC 720

CGCACGAGCC GGGGCTGGCA CACCACCGAC CTCAAGTACA ACCCCTCGCG GGTGGAGGCG 780

TTCCACCGGT ACGGGACGAC GGTAAACTGC ATCGTCGAGG AGGTGGACGC GCGCTCGGTG 840

TACCCGTACG ACGAGTTTGT GCTGGCGACT GGCGACTTTG TGTACATGTC CCCGTTTTAC 900

GGCTACCGGG AGGGGTCGCA CACCGAACAC ACCAGCTACG CCGCCGACCG CTTCAAGCAG 960

GTTGACGGCT TCTACGCGCG CGACCTCACC ACCAAGGCCC GGGCCACGGC GCCGACCACC 1020

CGGAACCTGC TCACGACCCC CAAGTTCACC GTGGCCTGGG ACTGGGTGCC AAAGCGCCCG 1080

TCGGTCTGCA CCATGACCAA GTGGCAGGAG GTGGACGAGA TGCTGCGCTC CGAGTACGGC 1140

GGCTCCTTCC GATTCTCCTC CGACGCCATA TCCACCACCT TCACCACCAA CCTGACCGAG 1200

TACCCGCTCT CGCGCGTGGA CCTGGGGGAC TGCATCGGCA AGGACGCCCG CGACGCCATG 1260

GACCGCATCT TCGCCCGCAG GTACAACGCG ACGCACATCA AGGTGGGCCA GCCGCAGTAC 1320

TACCTGGCCA ATGGGGGCTT TCTGATCGCG TACCAGCCCC TTCTCAGCAA CACGCTCGCG 1380

GAGCTGTACG TGCGGGAACA CCTCCGAGAG CAGAGCCGCA AGCCCCCAAA CCCCACGCCC 1440

CCGCCGCCCG GGGCCAGCGC CAACGCGTCC GTGGAGCGCA TCAAGACCAC CTCCTCCATC 1500

GAGTTCGCCC GGCTGCAGTT TACGTACAAC CACATACAGC GCCATGTCAA CGATATGTTG 1560

GGCCGCGTTG CCATCGCGTG GTGCGAGCTG CAGAATCACG AGCTGACCCT GTGGAACGAG 1620

GCCCGCAAGC TGAACCCCAA CGCCATCGCC TCGGCCACCG TGGGCCGGCG GGTGAGCGCG 1680

CGGATGCTCG GCGACGTGAT GGCCGTCTCC ACGTGCGTGC CGGTCGCCGC GGACAACGTG 1740

ATCGTCCAAA ACTCGATGCG CATCAGCTCG CGGCCCGGGG CCTGCTACAG CCGCCCCCTG 1800

GTCAGCTTTC GGTACGAAGA CCAGGGCCCG TTGGTCGAGG GGCAGGTGGG GGAGAACAAC 1860

GAGCTGCGGC TGACGCGCGA TGCGATCGAG CCGTGCACCG TGGGACACCG GCGCTACTTC 1920

ACCTTCGGTG GGGGCTACGT GTACTTCGAG GAGTACGCGT ACTCCCACCA GCTGAGCCGC 1980

GCCGACATCA CCACCGTCAG CACCTTCATC GACCTCAACA TCACCATGCT GGAGGATCAC 2040

GAGTTTGTCC CCCTGGAGGT GTACACCCGC CACGAGATCA AGGACAGCGG CCTGCTGGAC 2100

TACACGGAGG TCCAGCGCCG CAACCAGCTG CACGACCTGC GCTTCGCCGA CATCGACACG 2160

GTCATCCACG CCGACGCCAA CGCCGCCATG TTCGCGGGCC TGGGCGCGTT CTTCGAGGGG 2220

ATGGGCGACC TGGGGCGCGC GGTCGGCAAG GTGGTGATGG GCATCGTGGG CGGCGTGGTA 2280

TCGGCCGTGT CGGGCGTGTC CTCCTTCATG TCCAACCCCT TTGGGGCGCT GGCCGTGGGT 2340

CTGTTGGTCC TGGCCGGCCT GGCGGCGGCT TTCTTCGCCT TTCGCTACGT CATGCGGCTG 2400

CAGAGCAACC CCATGAAGGC CCTGTACCCG CTAACCACCA AGGAGCTCAA GAACCCCACC 2460

AACCCGGACG CGTCCGGGGA GGGCGAGGAG GGCGGCGACT TTGACGAGGC CAAGCTAGCC 2520

GAGGCCCGGG AGATGATACG GTACATGGCC CTGGTGTCTG CCATGGAGCG CACGGAACAC 2580

AAGGCCAAGA AGAAGGGCAC GAGCGCGCTG CTCAGCGCCA AGGTCACCGA CATGGTCATG 2640

CGCAAGCGCC GCAACACCAA CTACACCCAA GTTCCCAACA AAGACGGTGA CGCCGACGAG 2700

GACGACCTGT GAC 2713



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 808 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Val Pro Asn Lys His Leu Leu Leu Ile Ile Leu Ser Phe Ser Thr
1 5 10 15

Ala Cys Gly Gln Thr Thr Pro Thr Thr Ala Val Glu Lys Asn Lys Thr
20 25 30

Gln Ala Ile Tyr Gln Glu Tyr Phe Lys Tyr Arg Val Cys Ser Ala Ser
35 40 45

Thr Thr Gly Glu Leu Phe Arg Phe Asp Leu Asp Arg Thr Cys Pro Ser
50 55 60

Thr Glu Asp Lys Val His Lys Glu Gly Ile Leu Leu Val Tyr Lys Lys
65 70 75 80

Asn Ile Val Pro Tyr Ile Phe Lys Val Arg Arg Tyr Lys Lys Ile Thr
85 90 95

Thr Ser Val Arg Ile Phe Asn Gly Trp Thr Arg Glu Gly Val Ala Ile
100 105 110

Thr Asn Lys Trp Glu Leu Ser Arg Ala Val Pro Lys Tyr Glu Ile Asp
115 120 125

Ile Met Asp Lys Thr Tyr Gln Cys His Asn Cys Met Gln Ile Glu Val
130 135 140

Asn Gly Met Leu Asn Ser Tyr Tyr Asp Arg Asp Gly Asn Asn Lys Thr
145 150 155 160

Val Asp Leu Lys Pro Val Asp Gly Leu Thr Gly Ala Ile Thr Arg Tyr
165 170 175

Ile Ser Gln Pro Lys Val Phe Ala Asp Pro Gly Trp Leu Trp Gly Thr
180 185 190

Tyr Arg Thr Arg Thr Thr Val Asn Cys Glu Ile Val Asp Met Phe Ala
195 200 205

Arg Ser Ala Asp Pro Tyr Thr Tyr Phe Val Thr Ala Leu Gly Asp Thr
210 215 220

Val Glu Val Ser Pro Phe Cys Asp Val Asp Asn Ser Cys Pro Asn Ala
225 230 235 240

Thr Asp Val Leu Ser Val Gln Ile Asp Leu Asn His Thr Val Val Asp
245 250 255

Tyr Gly Asn Arg Ala Thr Ser Gln Gln His Lys Lys Arg Ile Phe Ala
260 265 270

His Thr Leu Asp Tyr Ser Val Ser Trp Glu Ala Val Asn Lys Ser Ala
275 280 285

Ser Val Cys Ser Met Val Phe Trp Lys Ser Phe Gln Arg Ala Ile Gln
290 295 300

Thr Glu His Asp Leu Thr Tyr His Phe Ile Ala Asn Glu Ile Thr Ala
305 310 315 320

Gly Phe Ser Thr Val Lys Glu Pro Leu Ala Asn Phe Thr Ser Asp Tyr
325 330 335

Asn Cys Leu Met Thr His Ile Asn Thr Thr Leu Glu Asp Lys Ile Ala
340 345 350

Arg Val Asn Asn Thr His Thr Pro Asn Gly Thr Ala Glu Tyr Tyr Gln
355 360 365

Thr Glu Gly Gly Met Ile Leu Val Trp Gln Pro Leu Ile Ala Ile Glu
370 375 380

Leu Glu Glu Ala Met Leu Glu Ala Thr Thr Ser Pro Val Thr Pro Ser
385 390 395 400

Ala Pro Thr Ser Ser Ser Arg Ser Lys Arg Ala Ile Arg Ser Ile Arg
405 410 415

Asp Val Ser Ala Gly Ser Glu Asn Asn Val Phe Leu Ser Gln Ile Gln
420 425 430

Tyr Ala Tyr Asp Lys Leu Arg Gln Ser Ile Asn Asn Val Leu Glu Glu
435 440 445

Leu Ala Ile Thr Trp Cys Arg Glu Gln Val Arg Gln Thr Met Val Trp
450 455 460

Tyr Glu Ile Ala Lys Ile Asn Pro Thr Ser Val Met Thr Ala Ile Tyr
465 470 475 480

Gly Lys Pro Val Ser Arg Lys Ala Leu Gly Asp Val Ile Ser Val Thr
485 490 495

Glu Cys Ile Asn Val Asp Gln Ser Ser Val Ser Ile His Lys Ser Leu
500 505 510

Lys Thr Glu Asn Asn Asp Ile Cys Tyr Ser Arg Pro Pro Val Thr Phe
515 520 525

Lys Phe Val Asn Ser Ser Gln Leu Phe Lys Gly Gln Leu Gly Ala Arg
530 535 540

Asn Glu Ile Leu Leu Ser Glu Ser Leu Val Glu Asn Cys His Gln Asn
545 550 555 560

Ala Glu Thr Phe Phe Thr Ala Lys Asn Glu Thr Tyr His Phe Lys Asn
565 570 575

Tyr Val His Val Glu Thr Leu Pro Val Asn Asn Ile Ser Thr Leu Asp
580 585 590

Thr Phe Leu Ala Leu Asn Leu Thr Phe Ile Glu Asn Ile Asp Phe Lys
595 600 605

Ala Val Glu Leu Tyr Ser Ser Gly Glu Arg Lys Leu Ala Asn Val Phe
610 615 620

Asp Leu Glu Thr Met Phe Arg Glu Tyr Asn Tyr Tyr Ala Gln Ser Ile
625 630 635 640

Ser Gly Leu Arg Lys Asp Phe Asp Asn Ser Gln Arg Asn Asn Arg Asp
645 650 655

Arg Ile Ile Gln Asp Phe Ser Glu Ile Leu Ala Asp Leu Gly Ser Ile
660 665 670

Gly Lys Val Ile Val Asn Val Ala Ser Gly Ala Phe Ser Leu Phe Gly
675 680 685

Gly Ile Val Thr Gly Ile Leu Asn Phe Ile Lys Asn Pro Leu Gly Gly
690 695 700

Met Phe Thr Phe Leu Leu Ile Gly Ala Val Ile Ile Leu Val Ile Leu
705 710 715 720

Leu Val Arg Arg Thr Asn Asn Met Ser Gln Ala Pro Ile Arg Met Ile
725 730 735

Tyr Pro Asp Val Glu Lys Ser Lys Ser Thr Val Thr Pro Met Glu Pro
740 745 750

Glu Thr Ile Lys Gln Ile Leu Leu Gly Met His Asn Met Gln Gln Glu
755 760 765

Ala Tyr Lys Lys Lys Glu Glu Gln Arg Ala Ala Arg Pro Ser Ile Phe
770 775 780

Arg Gln Ala Ala Glu Thr Phe Leu Arg Lys Arg Ser Gly Tyr Lys Gln
785 790 795 800

Ile Ser Thr Glu Asp Lys Ile Val
805



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 874 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Tyr Tyr Lys Thr Ile Leu Phe Phe Ala Leu Ile Lys Val Cys Ser
1 5 10 15

Phe Asn Gln Thr Thr Thr His Ser Thr Thr Thr Ser Pro Ser Ile Ser
20 25 30

Ser Thr Thr Ser Ser Thr Thr Thr Ser Thr Ser Lys Pro Ser Asn Thr
35 40 45

Thr Ser Thr Asn Ser Ser Leu Ala Ala Ser Pro Gln Asn Thr Ser Thr
50 55 60

Ser Lys Pro Ser Thr Asp Asn Gln Gly Thr Ser Thr Pro Thr Ile Pro
65 70 75 80

Thr Val Thr Asp Asp Thr Ala Ser Lys Asn Phe Tyr Lys Tyr Arg Val
85 90 95

Cys Ser Ala Ser Ser Ser Ser Gly Glu Leu Phe Arg Phe Asp Leu Asp
100 105 110

Gln Thr Cys Pro Asp Thr Lys Asp Lys Lys His Val Glu Gly Ile Leu
115 120 125

Leu Val Leu Lys Lys Asn Ile Val Pro Tyr Ile Phe Lys Val Arg Lys
130 135 140

Tyr Arg Lys Ile Ala Thr Ser Val Thr Val Tyr Arg Gly Trp Ser Gln
145 150 155 160

Ala Ala Val Thr Asn Arg Asp Asp Ile Ser Arg Ala Ile Pro Tyr Asn
165 170 175

Glu Ile Ser Met Ile Asp Arg Thr Tyr His Cys Phe Ser Ala Met Ala
180 185 190

Thr Val Ile Asn Gly Ile Leu Asn Thr Tyr Ile Asp Arg Asp Ser Glu
195 200 205

Asn Lys Ser Val Pro Leu Gln Pro Val Ala Gly Leu Thr Glu Asn Ile
210 215 220

Asn Arg Tyr Phe Ser Gln Pro Leu Ile Tyr Ala Glu Pro Gly Trp Phe
225 230 235 240

Pro Gly Ile Tyr Arg Val Arg Thr Thr Val Asn Cys Glu Val Val Asp
245 250 255

Met Tyr Ala Arg Ser Val Glu Pro Tyr Thr His Phe Ile Thr Ala Leu
260 265 270

Gly Asp Thr Ile Glu Ile Ser Pro Phe Cys His Asn Asn Ser Gln Cys
275 280 285

Thr Thr Gly Asn Ser Thr Ser Arg Asp Ala Thr Lys Val Trp Ile Glu
290 295 300

Glu Asn His Gln Thr Val Asp Tyr Glu Arg Arg Gly His Pro Thr Lys
305 310 315 320

Asp Lys Arg Ile Phe Leu Lys Asp Glu Glu Tyr Thr Ile Ser Trp Lys
325 330 335

Ala Glu Asp Arg Glu Arg Ala Ile Cys Asp Phe Val Ile Trp Lys Thr
340 345 350

Phe Pro Arg Ala Ile Gln Thr Ile His Asn Glu Ser Phe His Phe Val
355 360 365

Ala Asn Glu Val Thr Ala Ser Phe Leu Thr Ser Asn Gln Glu Glu Thr
370 375 380

Glu Leu Arg Gly Asn Thr Glu Ile Leu Asn Cys Met Asn Ser Thr Ile
385 390 395 400

Asn Glu Thr Leu Glu Glu Thr Val Lys Lys Phe Asn Lys Ser His Ile
405 410 415

Arg Asp Gly Glu Val Lys Tyr Tyr Lys Thr Asn Gly Gly Leu Phe Leu
420 425 430

Ile Trp Gln Ala Met Lys Pro Leu Asn Leu Ser Glu His Thr Asn Tyr
435 440 445

Thr Ile Glu Arg Asn Asn Lys Thr Gly Asn Lys Ser Arg Gln Lys Arg
450 455 460

Ser Val Asp Thr Lys Thr Phe Gln Gly Ala Lys Gly Leu Ser Thr Ala
465 470 475 480

Gln Val Gln Tyr Ala Tyr Asp His Leu Arg Thr Ser Met Asn His Ile
485 490 495

Leu Glu Glu Leu Thr Lys Thr Trp Cys Arg Glu Gln Lys Lys Asp Asn
500 505 510

Leu Met Trp Tyr Glu Leu Ser Lys Ile Asn Pro Val Ser Val Met Ala
515 520 525

Ala Ile Tyr Gly Lys Pro Val Ala Val Lys Ala Met Gly Asp Ala Phe
530 535 540

Met Val Ser Glu Cys Ile Asn Val Asp Gln Ala Ser Val Asn Ile His
545 550 555 560

Lys Ser Met Arg Thr Asp Asp Pro Lys Val Cys Tyr Ser Arg Pro Leu
565 570 575

Val Thr Phe Lys Phe Val Asn Ser Thr Ala Thr Phe Arg Gly Gln Leu
580 585 590

Gly Thr Arg Asn Glu Ile Leu Leu Thr Asn Thr His Val Glu Thr Cys
595 600 605

Arg Pro Thr Ala Asp His Tyr Phe Phe Val Lys Asn Met Thr His Tyr
610 615 620

Phe Lys Asp Tyr Lys Phe Val Lys Thr Met Asp Thr Asn Asn Ile Ser
625 630 635 640

Thr Leu Asp Thr Phe Leu Thr Leu Asn Leu Thr Phe Ile Asp Asn Ile
645 650 655

Asp Phe Lys Thr Val Glu Leu Tyr Ser Glu Thr Glu Arg Lys Met Ala
660 665 670

Ser Ala Leu Asp Leu Glu Thr Met Phe Arg Glu Tyr Asn Tyr Tyr Thr
675 680 685

Gln Lys Leu Ala Ser Leu Arg Glu Asp Leu Asp Asn Thr Ile Asp Leu
690 695 700

Asn Arg Asp Arg Leu Val Lys Asp Leu Ser Glu Met Met Ala Asp Leu
705 710 715 720

Gly Asp Ile Gly Lys Val Val Val Asn Thr Phe Ser Gly Ile Val Thr
725 730 735

Val Phe Gly Ser Ile Val Gly Gly Phe Val Ser Phe Phe Thr Asn Pro
740 745 750

Ile Gly Gly Val Thr Ile Ile Leu Leu Leu Ile Val Val Val Phe Val
755 760 765

Val Phe Ile Val Ser Arg Arg Thr Asn Asn Met Asn Glu Ala Pro Ile
770 775 780

Lys Met Ile Tyr Pro Asn Ile Asp Lys Ala Ser Glu Gln Glu Asn Ile
785 790 795 800

Gln Pro Leu Pro Gly Glu Glu Ile Lys Arg Ile Leu Leu Gly Met His
805 810 815

Gln Leu Gln Gln Ser Glu His Gly Lys Ser Glu Glu Glu Ala Ser His
820 825 830

Lys Pro Gly Leu Phe Gln Leu Leu Gly Asp Gly Leu Gln Leu Leu Arg
835 840 845

Arg Arg Gly Tyr Thr Arg Leu Pro Thr Phe Asp Pro Ser Pro Gly Asn
850 855 860

Asp Thr Ser Glu Thr His Gln Lys Tyr Val
865 870



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 874 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Gly Val Gly Gly Gly Pro Arg Val Val Leu Cys Leu Trp Cys Val
1 5 10 15

Ala Ala Leu Leu Cys Gln Gly Val Ala Gln Glu Val Val Ala Glu Thr
20 25 30

Thr Thr Pro Phe Ala Thr His Arg Pro Glu Val Val Ala Glu Glu Asn
35 40 45

Pro Ala Asn Pro Phe Leu Pro Phe Arg Val Cys Gly Ala Ser Pro Thr
50 55 60

Gly Gly Glu Ile Phe Arg Phe Pro Leu Glu Glu Ser Cys Pro Asn Thr
65 70 75 80

Glu Asp Lys Asp His Ile Glu Gly Ile Ala Leu Ile Tyr Lys Thr Asn
85 90 95

Ile Val Pro Tyr Val Phe Asn Val Arg Lys Tyr Arg Lys Ile Met Thr
100 105 110

Ser Thr Thr Ile Tyr Lys Gly Trp Ser Glu Asp Ala Ile Thr Asn Gln
115 120 125

His Thr Arg Ser Tyr Ala Val Pro Leu Tyr Glu Val Gln Met Met Asp
130 135 140

His Tyr Tyr Gln Cys Phe Ser Ala Val Gln Val Asn Glu Gly Gly His
145 150 155 160

Val Asn Thr Tyr Tyr Asp Arg Asp Gly Trp Asn Glu Thr Ala Phe Leu
165 170 175

Lys Pro Ala Asp Gly Leu Thr Ser Ser Ile Thr Arg Tyr Gln Ser Gln
180 185 190

Pro Glu Val Tyr Ala Thr Pro Arg Asn Leu Leu Trp Ser Tyr Thr Thr
195 200 205

Arg Thr Thr Val Asn Cys Glu Val Thr Glu Met Ser Ala Arg Ser Met
210 215 220

Lys Pro Phe Glu Phe Phe Val Thr Ser Val Gly Asp Thr Ile Glu Met
225 230 235 240

Ser Pro Phe Leu Lys Glu Asn Gly Thr Glu Pro Glu Lys Ile Leu Lys
245 250 255

Arg Pro His Ser Ile Gln Leu Leu Lys Asn Tyr Ala Val Thr Lys Tyr
260 265 270

Gly Val Gly Leu Gly Gln Ala Asp Asn Ala Thr Arg Phe Phe Ala Ile
275 280 285

Phe Gly Asp Tyr Ser Leu Ser Trp Lys Ala Thr Thr Glu Asn Ser Ser
290 295 300

Tyr Cys Asp Leu Ile Leu Trp Lys Gly Phe Ser Asn Ala Ile Gln Thr
305 310 315 320

Gln His Asn Ser Ser Leu His Phe Ile Ala Asn Asp Ile Thr Ala Ser
325 330 335

Phe Ser Thr Pro Leu Glu Glu Glu Ala Asn Phe Asn Glu Thr Phe Lys
340 345 350

Cys Ile Trp Asn Asn Thr Gln Glu Glu Ile Gln Lys Lys Leu Lys Glu
355 360 365

Val Glu Lys Thr His Arg Pro Asn Gly Thr Ala Lys Val Tyr Lys Thr
370 375 380

Thr Gly Asn Leu Tyr Ile Val Trp Gln Pro Leu Ile Gln Ile Asp Leu
385 390 395 400

Leu Asp Thr His Ala Lys Leu Tyr Asn Leu Thr Asn Ala Thr Ala Ser
405 410 415

Pro Thr Ser Thr Pro Thr Thr Ser Pro Arg Arg Arg Arg Arg Asp Thr
420 425 430

Ser Ser Val Ser Gly Gly Gly Asn Asn Gly Asp Asn Ser Thr Lys Glu
435 440 445

Glu Ser Val Ala Ala Ser Gln Val Gln Phe Ala Tyr Asp Asn Leu Arg
450 455 460

Lys Ser Ile Asn Arg Val Leu Gly Glu Leu Ser Arg Ala Trp Cys Arg
465 470 475 480

Glu Gln Tyr Arg Ala Ser Leu Met Trp Tyr Glu Leu Ser Lys Ile Asn
485 490 495

Pro Thr Ser Val Met Ser Ala Ile Tyr Gly Arg Pro Val Ser Ala Lys
500 505 510

Leu Ile Gly Asp Val Val Ser Val Ser Asp Cys Ile Ser Val Asp Gln
515 520 525

Lys Ser Val Phe Val His Lys Asn Met Lys Val Pro Gly Lys Glu Asp
530 535 540

Leu Cys Tyr Thr Arg Pro Val Val Gly Phe Lys Phe Ile Asn Gly Ser
545 550 555 560

Glu Leu Phe Ala Gly Gln Leu Gly Pro Arg Asn Glu Ile Val Leu Ser
565 570 575

Thr Ser Gln Val Glu Val Cys Gln His Ser Cys Glu His Tyr Phe Gln
580 585 590

Ala Gly Asn Gln Met Tyr Lys Tyr Lys Asp Tyr Tyr Tyr Val Ser Thr
595 600 605

Leu Asn Leu Thr Asp Ile Pro Thr Leu His Thr Met Ile Thr Leu Asn
610 615 620

Leu Ser Leu Val Glu Asn Ile Asp Phe Lys Val Ile Glu Leu Tyr Ser
625 630 635 640

Lys Thr Glu Lys Arg Leu Ser Asn Val Phe Asp Ile Glu Thr Met Phe
645 650 655

Arg Glu Tyr Asn Tyr Tyr Thr Gln Asn Leu Asn Gly Leu Arg Lys Asp
660 665 670

Leu Asp Asp Ser Ile Asp His Gly Arg Asp Ser Phe Ile Gln Thr Leu
675 680 685

Gly Asp Ile Met Gln Asp Leu Gly Thr Ile Gly Lys Val Val Val Asn
690 695 700

Val Ala Ser Gly Val Phe Ser Leu Phe Gly Ser Ile Val Ser Gly Val
705 710 715 720

Ile Ser Phe Phe Lys Asn Pro Phe Gly Gly Met Leu Leu Ile Val Leu
725 730 735

Ile Ile Ala Gly Val Val Val Val Tyr Leu Phe Met Thr Arg Ser Arg
740 745 750

Ser Ile Tyr Ser Ala Pro Ile Arg Met Leu Tyr Pro Gly Val Glu Arg
755 760 765

Ala Ala Gln Glu Pro Gly Ala His Pro Val Ser Glu Asp Gln Ile Arg
770 775 780

Asn Ile Leu Met Gly Met His Gln Phe Gln Gln Arg Gln Arg Ala Glu
785 790 795 800

Glu Glu Ala Arg Arg Glu Glu Glu Val Lys Gly Lys Arg Thr Leu Phe
805 810 815

Glu Val Ile Arg Asp Ser Ala Thr Ser Val Leu Arg Arg Arg Arg Gly
820 825 830

Gly Gly Gly Tyr Gln Arg Leu Gln Arg Asp Gly Ser Asp Asp Glu Gly
835 840 845

Asp Tyr Glu Pro Leu Arg Arg Gln Asp Gly Gly Tyr Asp Asp Val Asp
850 855 860

Val Glu Ala Gly Thr Ala Asp Thr Gly Val
865 870



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 849 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Met Tyr Pro Thr Val Lys Ser Met Arg Val Ala His Leu Thr Asn Leu
1 5 10 15

Leu Thr Leu Leu Cys Leu Leu Cys His Thr His Leu Tyr Val Cys Gln
20 25 30

Pro Thr Thr Leu Arg Gln Pro Ser Asp Met Thr Pro Ala Gln Asp Ala
35 40 45

Pro Thr Glu Thr Pro Pro Pro Leu Ser Thr Asn Thr Asn Arg Gly Phe
50 55 60

Glu Tyr Phe Arg Val Cys Gly Val Ala Ala Thr Gly Glu Thr Phe Arg
65 70 75 80

Phe Asp Leu Asp Lys Thr Cys Pro Ser Thr Gln Asp Lys Lys His Val
85 90 95

Glu Gly Ile Leu Leu Val Tyr Lys Ile Asn Ile Val Pro Tyr Ile Phe
100 105 110

Lys Ile Arg Arg Tyr Arg Lys Ile Ile Thr Gln Leu Thr Ile Trp Arg
115 120 125

Gly Leu Thr Thr Ser Ser Val Thr Gly Lys Phe Glu Met Ala Thr Gln
130 135 140

Ala His Glu Trp Glu Val Gly Asp Phe Asp Ser Ile Tyr Gln Cys Tyr
145 150 155 160

Asn Ser Ala Thr Met Val Val Asn Asn Val Arg Gln Val Tyr Val Asp
165 170 175

Arg Asp Gly Val Asn Lys Thr Val Asn Ile Arg Pro Val Asp Gly Leu
180 185 190

Thr Gly Asn Ile Gln Arg Tyr Phe Ser Gln Pro Thr Leu Tyr Ser Glu
195 200 205

Pro Gly Trp Met Pro Gly Phe Tyr Arg Val Arg Thr Thr Val Asn Cys
210 215 220

Glu Ile Val Asp Met Val Ala Arg Ser Met Asp Pro Tyr Asn Tyr Ile
225 230 235 240

Ala Thr Ala Leu Gly Asp Ser Leu Glu Leu Ser Pro Phe Gln Thr Phe
245 250 255

Asp Asn Thr Ser Gln Cys Thr Ala Pro Lys Arg Ala Asp Met Arg Val
260 265 270

Arg Glu Val Lys Asn Tyr Lys Phe Val Asp Tyr Asn Asn Arg Gly Thr
275 280 285

Ala Pro Ala Gly Gln Ser Arg Thr Phe Leu Glu Thr Pro Ser Ala Thr
290 295 300

Tyr Ser Trp Lys Thr Ala Thr Arg Gln Thr Ala Thr Cys Asp Leu Val
305 310 315 320

His Trp Lys Thr Phe Pro Arg Ala Ile Gln Thr Ala His Glu His Ser
325 330 335

Tyr His Phe Val Ala Asn Glu Val Thr Ala Thr Phe Asn Thr Pro Leu
340 345 350

Thr Glu Val Glu Asn Phe Thr Ser Thr Tyr Ser Cys Val Ser Asp Gln
355 360 365

Ile Asn Lys Thr Ile Ser Glu Tyr Ile Gln Lys Leu Asn Asn Ser Tyr
370 375 380

Val Ala Ser Gly Lys Thr Gln Tyr Phe Lys Thr Asp Gly Asn Leu Tyr
385 390 395 400

Leu Ile Trp Gln Pro Leu Glu His Pro Glu Ile Glu Asp Ile Asp Glu
405 410 415

Asp Ser Asp Pro Glu Pro Thr Pro Ala Pro Pro Lys Ser Thr Arg Arg
420 425 430

Lys Arg Glu Ala Ala Asp Asn Gly Asn Ser Thr Ser Glu Val Ser Lys
435 440 445

Gly Ser Glu Asn Pro Leu Ile Thr Ala Gln Ile Gln Phe Ala Tyr Asp
450 455 460

Lys Leu Thr Thr Ser Val Asn Asn Val Leu Glu Glu Leu Ser Arg Ala
465 470 475 480

Trp Cys Arg Glu Gln Val Arg Asp Thr Leu Met Trp Tyr Glu Leu Ser
485 490 495

Lys Val Asn Pro Thr Ser Val Met Ser Ala Ile Tyr Gly Lys Pro Val
500 505 510

Ala Ala Arg Tyr Val Gly Asp Ala Ile Ser Val Thr Asp Cys Ile Tyr
515 520 525

Val Asp Gln Ser Ser Val Asn Ile His Gln Ser Leu Arg Leu Gln His
530 535 540

Asp Lys Thr Thr Cys Tyr Ser Arg Pro Arg Val Thr Phe Lys Phe Ile
545 550 555 560

Asn Ser Thr Asp Pro Leu Thr Gly Gln Leu Gly Pro Arg Lys Glu Ile
565 570 575

Ile Leu Ser Asn Thr Asn Ile Glu Thr Cys Lys Asp Glu Ser Glu His
580 585 590

Tyr Phe Ile Val Gly Glu Tyr Ile Tyr Tyr Tyr Lys Asn Tyr Ile Phe
595 600 605

Glu Glu Lys Leu Asn Leu Ser Ser Ile Ala Thr Leu Asp Thr Phe Ile
610 615 620

Ala Leu Asn Ile Ser Phe Ile Glu Asn Ile Asp Phe Lys Thr Val Glu
625 630 635 640

Leu Tyr Ser Ser Thr Glu Arg Lys Leu Ala Ser Ser Val Phe Asp Ile
645 650 655

Glu Ser Met Phe Arg Glu Tyr Asn Tyr Tyr Thr Tyr Ser Leu Ala Gly
660 665 670

Ile Lys Lys Asp Leu Asp Asn Thr Ile Asp Tyr Asn Arg Asp Arg Leu
675 680 685

Val Gln Asp Leu Ser Asp Met Met Ala Asp Leu Gly Asp Ile Gly Arg
690 695 700

Ser Val Val Asn Val Val Ser Ser Val Val Thr Phe Phe Ser Ser Ile
705 710 715 720

Val Thr Gly Phe Ile Lys Phe Phe Thr Asn Pro Leu Gly Gly Ile Phe
725 730 735

Ile Leu Leu Ile Ile Gly Gly Ile Ile Phe Leu Val Val Val Leu Asn
740 745 750

Arg Arg Asn Ser Gln Phe His Asp Ala Pro Ile Lys Met Leu Tyr Pro
755 760 765

Ser Val Glu Asn Tyr Ala Ala Arg Gln Ala Pro Pro Pro Tyr Ser Ala
770 775 780

Ser Pro Pro Ala Ile Asp Lys Glu Glu Ile Lys Arg Ile Leu Leu Gly
785 790 795 800

Met His Gln Val His Gln Glu Glu Lys Glu Ala Gln Lys Gln Leu Thr
805 810 815

Asn Ser Gly Pro Thr Leu Trp Gln Lys Ala Thr Gly Phe Leu Arg Asn
820 825 830

Arg Arg Lys Gly Tyr Ser Gln Leu Pro Leu Glu Asp Glu Ser Thr Ser
835 840 845



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 857 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 



Met Thr Arg Arg Arg Val Leu Ser Val Val Val Leu Leu Ala Ala Leu
1 5 10 15

Ala Cys Arg Leu Gly Ala Gln Thr Pro Glu Gln Pro Ala Pro Pro Ala
20 25 30

Thr Thr Val Gln Pro Thr Ala Thr Arg Gln Gln Thr Ser Phe Pro Phe
35 40 45

Arg Val Cys Glu Leu Ser Ser His Gly Asp Leu Phe Arg Phe Ser Ser
50 55 60

Asp Ile Gln Cys Pro Ser Phe Gly Thr Arg Glu Asn His Thr Glu Gly
65 70 75 80

Leu Leu Met Val Phe Lys Asp Asn Ile Ile Pro Tyr Ser Phe Lys Val
85 90 95

Arg Ser Tyr Thr Lys Ile Val Thr Asn Ile Leu Ile Tyr Asn Gly Trp
100 105 110

Tyr Ala Asp Ser Val Thr Asn Arg His Glu Glu Lys Phe Ser Val Asp
115 120 125

Ser Tyr Glu Thr Asp Gln Met Asp Thr Ile Tyr Gln Cys Tyr Asn Ala
130 135 140

Val Lys Met Thr Lys Asp Gly Leu Thr Arg Val Tyr Val Asp Arg Asp
145 150 155 160

Gly Val Asn Ile Thr Val Asn Leu Lys Pro Thr Gly Gly Leu Ala Asn
165 170 175

Gly Val Arg Arg Tyr Ala Ser Gln Thr Glu Leu Tyr Asp Ala Pro Gly
180 185 190

Trp Leu Ile Trp Thr Tyr Arg Thr Arg Thr Thr Val Asn Cys Leu Ile
195 200 205

Thr Asp Met Met Ala Lys Ser Asn Ser Pro Phe Asp Phe Phe Val Thr
210 215 220

Thr Thr Gly Gln Thr Val Glu Met Ser Pro Phe Tyr Asp Gly Lys Asn
225 230 235 240

Lys Glu Thr Phe His Glu Arg Ala Asp Ser Phe His Val Arg Thr Asn
245 250 255

Tyr Lys Ile Val Asp Tyr Asp Asn Arg Gly Thr Asn Pro Gln Gly Glu
260 265 270

Arg Arg Ala Phe Leu Asp Lys Gly Thr Tyr Thr Leu Ser Trp Lys Leu
275 280 285

Glu Asn Arg Thr Ala Tyr Cys Pro Leu Gln His Trp Gln Thr Phe Asp
290 295 300

Ser Thr Ile Ala Thr Glu Thr Gly Lys Ser Ile His Phe Val Thr Asp
305 310 315 320

Glu Gly Thr Ser Ser Phe Val Thr Asn Thr Thr Val Gly Ile Glu Leu
325 330 335

Pro Asp Ala Phe Lys Cys Ile Glu Glu Gln Val Asn Lys Thr Met His
340 345 350

Glu Lys Tyr Glu Ala Val Gln Asp Arg Tyr Thr Lys Gly Gln Glu Ala
355 360 365

Ile Thr Tyr Phe Ile Thr Ser Gly Gly Leu Leu Leu Ala Trp Leu Pro
370 375 380

Leu Thr Pro Arg Ser Leu Ala Thr Val Lys Asn Leu Thr Glu Leu Thr
385 390 395 400

Thr Pro Thr Ser Ser Pro Pro Ser Ser Pro Ser Pro Pro Ala Pro Ser
405 410 415

Ala Ala Arg Gly Ser Thr Pro Ala Ala Val Leu Arg Arg Arg Arg Arg
420 425 430

Asp Ala Gly Asn Ala Thr Thr Pro Val Pro Pro Thr Ala Pro Gly Lys
435 440 445

Ser Leu Gly Thr Leu Asn Asn Pro Ala Thr Val Gln Ile Gln Phe Ala
450 455 460

Tyr Asp Ser Leu Arg Arg Gln Ile Asn Arg Met Leu Gly Asp Leu Ala
465 470 475 480

Arg Ala Trp Cys Leu Glu Gln Lys Arg Gln Asn Met Val Leu Arg Glu
485 490 495

Leu Thr Lys Ile Asn Pro Thr Thr Val Met Ser Ser Ile Tyr Gly Lys
500 505 510

Ala Val Ala Ala Lys Arg Leu Gly Asp Val Ile Ser Val Ser Gln Cys
515 520 525

Val Pro Val Asn Gln Ala Thr Val Thr Leu Arg Lys Ser Met Arg Val
530 535 540

Pro Gly Ser Glu Thr Met Cys Tyr Ser Arg Pro Leu Val Ser Phe Ser
545 550 555 560

Phe Ile Asn Asp Thr Lys Thr Tyr Glu Gly Gln Leu Gly Thr Asp Asn
565 570 575

Glu Ile Phe Leu Thr Lys Lys Met Thr Glu Val Cys Gln Ala Thr Ser
580 585 590

Gln Tyr Tyr Phe Gln Ser Gly Asn Glu Ile His Val Tyr Asn Asp Tyr
595 600 605

His His Phe Lys Thr Ile Glu Leu Asp Gly Ile Ala Thr Leu Gln Thr
610 615 620

Phe Ile Ser Leu Asn Thr Ser Leu Ile Glu Asn Ile Asp Phe Ala Ser
625 630 635 640

Leu Glu Leu Tyr Ser Arg Asp Glu Gln Arg Ala Ser Asn Val Phe Asp
645 650 655

Leu Glu Gly Ile Phe Arg Glu Tyr Asn Phe Gln Ala Gln Asn Ile Ala
660 665 670

Gly Leu Arg Lys Asp Leu Asp Asn Ala Val Ser Asn Gly Arg Asn Gln
675 680 685

Phe Val Asp Gly Leu Gly Glu Leu Met Asp Ser Leu Gly Ser Val Gly
690 695 700

Gln Ser Ile Thr Asn Leu Val Ser Thr Val Gly Gly Leu Phe Ser Ser
705 710 715 720

Leu Val Ser Gly Phe Ile Ser Phe Phe Lys Asn Pro Phe Gly Gly Met
725 730 735

Leu Ile Leu Val Leu Val Ala Gly Val Val Ile Leu Val Ile Ser Leu
740 745 750

Thr Arg Arg Thr Arg Gln Met Ser Gln Gln Pro Val Gln Met Leu Tyr
755 760 765

Pro Gly Ile Asp Glu Leu Ala Gln Gln His Ala Ser Gly Glu Gly Pro
770 775 780

Gly Ile Asn Pro Ile Ser Lys Thr Glu Leu Gln Ala Ile Met Leu Ala
785 790 795 800

Leu His Glu Gln Asn Gln Glu Gln Lys Arg Ala Ala Gln Arg Ala Ala
805 810 815

Gly Pro Ser Val Ala Ser Arg Ala Leu Gln Ala Ala Arg Asp Arg Phe
820 825 830

Pro Gly Leu Arg Arg Arg Arg Tyr His Asp Pro Glu Thr Ala Ala Ala
835 840 845

Leu Leu Gly Glu Ala Glu Thr Glu Phe
850 855



(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 907 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Met Glu Ser Arg Ile Trp Cys Leu Val Val Cys Val Asn Leu Cys Ile
1 5 10 15

Val Cys Leu Gly Ala Ala Val Ser Ser Ser Ser Thr Arg Gly Thr Ser
20 25 30

Ala Thr His Ser His His Ser Ser His Thr Thr Ser Ala Ala His Ser
35 40 45

Arg Ser Gly Ser Val Ser Gln Arg Val Thr Ser Ser Gln Thr Val Ser
50 55 60

His Gly Val Asn Glu Thr Ile Tyr Asn Thr Thr Leu Lys Tyr Gly Asp
65 70 75 80

Val Val Gly Val Asn Thr Thr Lys Tyr Pro Tyr Arg Val Cys Ser Met
85 90 95

Ala Gln Gly Thr Asp Leu Ile Arg Phe Glu Arg Asn Ile Val Cys Thr
100 105 110

Ser Met Lys Pro Ile Asn Glu Asp Leu Asp Glu Gly Ile Met Val Val
115 120 125

Tyr Lys Arg Asn Ile Val Ala His Thr Phe Lys Val Arg Val Tyr Gln
130 135 140

Lys Val Leu Thr Phe Arg Arg Ser Tyr Ala Tyr Ile His Thr Thr Tyr
145 150 155 160

Leu Leu Gly Ser Asn Thr Glu Tyr Val Ala Pro Pro Met Trp Glu Ile
165 170 175

His His Ile Asn Ser His Ser Gln Cys Tyr Ser Ser Tyr Ser Arg Val
180 185 190

Ile Ala Gly Thr Val Phe Val Ala Tyr His Arg Asp Ser Tyr Glu Asn
195 200 205

Lys Thr Met Gln Leu Met Pro Asp Asp Tyr Ser Asn Thr His Ser Thr
210 215 220

Arg Tyr Val Thr Val Lys Asp Gln Trp His Ser Arg Gly Ser Thr Trp
225 230 235 240

Leu Tyr Arg Glu Thr Cys Asn Leu Asn Cys Met Val Thr Ile Thr Thr
245 250 255

Ala Arg Ser Lys Tyr Pro Tyr His Phe Phe Ala Thr Ser Thr Gly Asp
260 265 270

Val Val Asp Ile Ser Pro Phe Tyr Asn Gly Thr Asn Arg Asn Ala Ser
275 280 285

Tyr Phe Gly Glu Asn Ala Asp Lys Phe Phe Ile Phe Pro Asn Tyr Thr
290 295 300

Ile Val Ser Asp Phe Gly Arg Pro Asn Ser Ala Leu Glu Thr His Arg
305 310 315 320

Leu Val Ala Phe Leu Glu Arg Ala Asp Ser Val Ile Ser Trp Asp Ile
325 330 335

Gln Asp Glu Lys Asn Val Thr Cys Gln Leu Thr Phe Trp Glu Ala Ser
340 345 350

Glu Arg Thr Ile Arg Ser Glu Ala Glu Asp Ser Tyr His Phe Ser Ser
355 360 365

Ala Lys Met Thr Ala Thr Phe Leu Ser Lys Lys Gln Glu Val Asn Met
370 375 380

Ser Asp Ser Ala Leu Asp Cys Val Arg Asp Glu Ala Ile Asn Lys Leu
385 390 395 400

Gln Gln Ile Phe Asn Thr Ser Tyr Asn Gln Thr Tyr Glu Lys Tyr Gly
405 410 415

Asn Val Ser Val Phe Glu Thr Thr Gly Gly Leu Val Val Phe Trp Gln
420 425 430

Gly Ile Lys Gln Lys Ser Leu Val Glu Leu Glu Arg Leu Ala Asn Arg
435 440 445

Ser Ser Leu Asn Leu Thr His Asn Arg Thr Lys Arg Ser Thr Asp Gly
450 455 460

Asn Asn Ala Thr His Leu Ser Asn Met Glu Ser Val His Asn Leu Val
465 470 475 480

Tyr Ala Gln Leu Gln Phe Thr Tyr Asp Thr Leu Arg Gly Tyr Ile Asn
485 490 495

Arg Ala Leu Ala Gln Ile Ala Glu Ala Trp Cys Val Asp Gln Arg Arg
500 505 510

Thr Leu Glu Val Phe Lys Glu Leu Ser Lys Ile Asn Pro Ser Ala Ile
515 520 525

Leu Ser Ala Ile Tyr Asn Lys Pro Ile Ala Ala Arg Phe Met Gly Asp
530 535 540

Val Leu Gly Leu Ala Ser Cys Val Thr Ile Asn Gln Thr Ser Val Lys
545 550 555 560

Val Leu Arg Asp Met Asn Val Lys Glu Ser Pro Gly Arg Cys Tyr Ser
565 570 575

Arg Pro Val Val Ile Phe Asn Phe Ala Asn Ser Ser Tyr Val Gln Tyr
580 585 590

Gly Gln Leu Gly Glu Asp Asn Glu Ile Leu Leu Gly Asn His Arg Thr
595 600 605

Glu Glu Cys Gln Leu Pro Ser Leu Lys Ile Phe Ile Ala Gly Asn Ser
610 615 620

Ala Tyr Glu Tyr Val Asp Tyr Leu Phe Lys Arg Met Ile Asp Leu Ser
625 630 635 640

Ser Ile Ser Thr Val Asp Ser Met Ile Ala Leu Asp Ile Asp Pro Leu
645 650 655

Glu Asn Thr Asp Phe Arg Val Leu Glu Leu Tyr Ser Gln Lys Glu Leu
660 665 670

Arg Ser Ser Asn Val Phe Asp Leu Glu Glu Ile Met Arg Glu Phe Asn
675 680 685

Ser Tyr Lys Gln Arg Val Lys Tyr Val Glu Asp Lys Val Val Asp Pro
690 695 700

Leu Pro Pro Tyr Leu Lys Gly Leu Asp Asp Leu Met Ser Gly Leu Gly
705 710 715 720

Ala Ala Gly Lys Ala Val Gly Val Ala Ile Gly Ala Val Gly Gly Ala
725 730 735

Val Ala Ser Val Val Glu Gly Val Ala Thr Phe Leu Lys Asn Pro Phe
740 745 750

Gly Ala Phe Thr Ile Ile Leu Val Ala Ile Ala Val Val Ile Ile Ile
755 760 765

Tyr Leu Ile Tyr Thr Arg Gln Arg Arg Leu Cys Met Gln Pro Leu Gln
770 775 780

Asn Leu Phe Pro Tyr Leu Val Ser Ala Asp Gly Thr Thr Val Thr Ser
785 790 795 800

Gly Asn Thr Lys Asp Thr Ser Leu Gln Ala Pro Pro Ser Tyr Glu Glu
805 810 815

Ser Val Tyr Asn Ser Gly Arg Lys Gly Pro Gly Pro Pro Ser Ser Asp
820 825 830

Ala Ser Thr Ala Ala Pro Pro Tyr Thr Asn Glu Gln Ala Tyr Gln Met
835 840 845

Leu Leu Ala Leu Val Arg Leu Asp Ala Glu Gln Arg Ala Gln Gln Asn
850 855 860

Gly Thr Asp Ser Leu Asp Gly Gln Thr Gly Thr Gln Asp Lys Gly Gln
865 870 875 880

Lys Pro Asn Leu Leu Asp Arg Leu Arg His Arg Lys Asn Gly Tyr Arg
885 890 895

His Leu Lys Asp Ser Asp Glu Glu Glu Asn Val
900 905



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 830 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Met Ser Lys Met Val Val Leu Phe Leu Ala Val Phe Leu Met Asn Ser 
1 5 10 15

Val Leu Met Ile Tyr Cys Asp Pro Asp His Tyr Ile Arg Ala Gly Tyr
20 25 30

Asn His Lys Tyr Pro Phe Arg Ile Cys Ser Ile Ala Lys Gly Thr Asp
35 40 45

Leu Met Arg Phe Asp Arg Asp Ile Ser Cys Ser Pro Tyr Lys Ser Asn
50 55 60

Ala Lys Met Ser Glu Gly Phe Phe Ile Ile Tyr Lys Thr Asn Ile Glu
65 70 75 80

Thr Tyr Thr Phe Pro Val Arg Thr Tyr Lys Lys Glu Leu Thr Phe Gln
85 90 95 

Ser Ser Tyr Arg Asp Val Gly Val Val Tyr Phe Leu Asp Arg Thr Val 
100 105 110 

Met Gly Leu Ala Met Pro Val Tyr Glu Ala Asn Leu Val Asn Ser His 
115 120 125 

Ala Gln Cys Tyr Ser Ala Val Ala Met Lys Arg Pro Asp Gly Thr Val
130 135 140 

Phe Ser Ala Phe His Glu Asp Asn Asn Lys Asn Asn Thr Leu Asn Leu 
145 150 155 160 

Phe Pro Leu Asn Phe Lys Ser Ile Thr Asn Lys Arg Phe Ile Thr Thr
165 170 175 

Lys Glu Pro Tyr Phe Ala Arg Gly Pro Leu Trp Leu Tyr Ser Thr Ser 
180 185 190 

Thr Ser Leu Asn Cys Ile Val Thr Glu Ala Thr Ala Lys Ala Lys Tyr
195 200 205

Pro Phe Ser Tyr Phe Ala Leu Thr Thr Gly Glu Ile Val Glu Gly Ser
210 215 220 



Pro Phe Phe Asn Gly Ser Asn Gly Lys His Phe Ala Glu Pro Leu Glu
225 230 235 240

Lys Leu Thr Ile Leu Glu Asn Tyr Thr Met Ile Glu Asp Leu Met Asn
245 250 255

Gly Met Asn Gly Ala Thr Thr Leu Val Arg Lys Ile Ala Phe Leu Glu
260 265 270

Lys Ala Asp Thr Leu Phe Ser Trp Glu Ile Lys Glu Glu Asn Glu Ser
275 280 285 

Val Cys Met Leu Lys His Trp Thr Thr Val Thr His Gly Leu Arg Ala 
290 295 300 

Glu Thr Asp Glu Thr Tyr His Phe Ile Ser Lys Glu Leu Thr Ala Ala
305 310 315 320

Phe Val Ala Pro Lys Glu Ser Leu Asn Leu Thr Asp Pro Lys Gln Thr
325 330 335

Cys Ile Lys Asp Glu Phe Glu Lys Ile Ile Asn Glu Val Tyr Met Ser
340 345 350

Asp Tyr Asn Asp Thr Tyr Ser Met Asn Gly Ser Tyr Gln Ile Phe Lys
355 360 365

Thr Thr Gly Asp Leu Ile Leu Ile Trp Gln Pro Leu Val Gln Lys Ser
370 375 380

Leu Met Phe Leu Glu Gln Gly Ser Glu Lys Ile Arg Arg Arg Arg Asp
385 390 395 400

Val Val Asp Val Lys Ser Arg His Asp Ile Leu Tyr Val Gln Leu Gln
405 410 415

Tyr Leu Tyr Asp Thr Leu Lys Asp Tyr Ile Asn Asp Ala Leu Gly Asn
420 425 430

Leu Ala Glu Ser Trp Cys Leu Asp Gln Lys Arg Thr Ile Thr Met Leu
435 440 445

His Glu Leu Ser Lys Ile Ser Pro Ser Ser Ile Val Ser Glu Val Tyr
450 455 460

Gly Arg Pro Ile Ser Ala Gln Leu His Gly Asp Val Leu Ala Ile Ser
465 470 475 480

Lys Cys Ile Glu Val Asn Gln Ser Ser Val Gln Leu His Lys Ser Met
485 490 495 

Arg Val Val Asp Ala Lys Gly Val Arg Ser Glu Thr Met Cys Tyr Asn 
500 505 510 

Arg Pro Leu Val Thr Phe Ser Phe Val Asn Ser Thr Pro Glu Val Val 
515 520 525 

Pro Gly Gln Leu Gly Leu Asp Asn Glu Ile Leu Leu Gly Asp His Arg
530 535 540

Thr Glu Glu Cys Glu Ile Pro Ser Thr Lys Ile Phe Leu Ser Gly Asn
545 550 555 560

His Ala His Val Tyr Thr Asp Tyr Thr His Thr Asn Ser Thr Pro Ile
565 570 575

Glu Asp Ile Glu Val Leu Asp Ala Phe Ile Arg Leu Lys Ile Asp Pro
580 585 590 

Leu Glu Asn Ala Asp Phe Lys Val Leu Asp Leu Tyr Ser Pro Asp Glu 
595 600 605 

Leu Ser Arg Ala Asn Val Phe Asp Leu Glu Asn Ile Leu Arg Glu Tyr
610 615 620

Asn Ser Tyr Lys Ser Ala Leu Tyr Thr Ile Glu Ala Lys Ile Ala Thr
625 630 635 640

Asn Thr Pro Ser Tyr Val Asn Gly Ile Asn Ser Phe Leu Gln Gly Leu
645 650 655

Gly Ala Ile Gly Thr Gly Leu Gly Ser Val Ile Ser Val Thr Ala Gly
660 665 670 

Ala Leu Gly Asp Ile Val Gly Gly Val Val Ser Phe Leu Lys Asn Pro
675 680 685

Phe Gly Gly Gly Leu Met Leu Ile Leu Ala Ile Val Val Val Val Ile
690 695 700

Ile Ile Val Val Phe Val Arg Gln Arg His Val Leu Ser Lys Pro Ile
705 710 715 720


Asp Met Met Phe Pro Tyr Ala Thr Asn Pro Val Thr Thr Val Ser Ser 
725 730 735 

Val Thr Gly Thr Thr Val Val Lys Thr Pro Ser Val Lys Asp Val Asp 
740 745 750 

Gly Gly Thr Ser Val Ala Val Ser Glu Lys Glu Glu Gly Met Ala Asp 

755 760 765 

Val Ser Gly Gln Val Ser Asp Asp Glu Tyr Ser Gln Glu Ala Ala Leu
770 775 780

Lys Met Leu Lys Ala Ile Lys Ser Leu Asp Glu Ser Tyr Arg Arg Lys
785 790 795 800

Pro Ser Ser Ser Glu Ser His Ala Ser Lys Pro Ser Leu Ile Asp Arg
805 810 815 

Ile Arg Tyr Arg Gly Tyr Lys Ser Val Asn Val Glu Glu Ala
820 825 830


(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 868 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Met Phe Val Thr Ala Val Val Ser Val Ser Pro Ser Ser Phe Tyr Glu 
1 5 10 15

Ser Leu Gln Val Glu Pro Thr Gln Ser Glu Asp Ile Thr Arg Ser Ala
20 25 30

His Leu Gly Asp Gly Asp Glu Ile Arg Glu Ala Ile His Lys Ser Gln
35 40 45 

Asp Ala Glu Thr Lys Pro Thr Phe Tyr Val Cys Pro Pro Pro Thr Gly 
50 55 60 

Ser Thr Ile Val Arg Leu Glu Pro Thr Arg Thr Cys Pro Asp Tyr His
65 70 75 80

Leu Gly Lys Asn Phe Thr Glu Gly Ile Ala Val Val Tyr Lys Glu Asn
85 90 95

Ile Ala Ala Tyr Lys Phe Lys Ala Thr Val Tyr Tyr Lys Asp Val Ile
100 105 110

Val Ser Thr Ala Trp Ala Gly Ser Ser Tyr Thr Gln Ile Thr Asn Arg
115 120 125

Tyr Ala Asp Arg Val Pro Ile Pro Val Ser Glu Ile Thr Asp Thr Ile
130 135 140 

Asp Lys Phe Gly Lys Cys Ser Ser Lys Ala Thr Tyr Val Arg Asn Asn 
145 150 155 160 

His Lys Val Glu Ala Phe Asn Glu Asp Lys Asn Pro Gln Asp Met Pro
165 170 175 



Leu Ile Ala Ser Lys Tyr Asn Ser Val Gly Ser Lys Ala Trp His Thr
180 185 190

Thr Asn Asp Thr Tyr Met Val Ala Gly Thr Pro Gly Thr Tyr Arg Thr
195 200 205

Gly Thr Ser Val Asn Cys Ile Ile Glu Glu Val Glu Ala Arg Ser Ile
210 215 220

Phe Pro Tyr Asp Ser Phe Gly Leu Ser Thr Gly Asp Ile Ile Tyr Met
225 230 235 240 

Ser Pro Phe Phe Gly Leu Arg Asp Gly Ala Tyr Arg Glu His Ser Asn 
245 250 255 

Tyr Ala Met Asp Arg Phe His Gln Phe Glu Gly Tyr Arg Gln Arg Asp
260 265 270 

Leu Asp Thr Arg Ala Leu Leu Glu Pro Ala Ala Arg Asn Phe Leu Val 
275 280 285 

Thr Pro His Leu Thr Val Gly Trp Asn Trp Lys Pro Lys Arg Thr Glu 
290 295 300 

Val Cys Ser Leu Val Lys Trp Arg Glu Val Glu Asp Val Val Arg Asp 
305 310 315 320 

Glu Tyr Ala His Asn Phe Arg Phe Thr Met Lys Thr Leu Ser Thr Thr 
325 330 335 

Phe Ile Ser Glu Thr Asn Glu Phe Asn Leu Asn Gln Ile His Leu Ser
340 345 350

Gln Cys Val Lys Glu Glu Ala Arg Ala Ile Ile Asn Arg Ile Tyr Thr
355 360 365

Thr Arg Tyr Asn Ser Ser His Val Arg Thr Gly Asp Ile Gln Thr Tyr
370 375 380

Leu Ala Arg Gly Gly Phe Val Val Val Phe Gln Pro Leu Leu Ser Asn
385 390 395 400

Ser Leu Ala Arg Leu Tyr Leu Gln Glu Leu Val Arg Glu Asn Thr Asn
405 410 415

His Ser Pro Gln Lys His Pro Thr Arg Asn Thr Arg Ser Arg Arg Ser
420 425 430

Val Pro Val Glu Leu Arg Ala Asn Arg Thr Ile Thr Thr Thr Ser Ser
435 440 445

Val Glu Phe Ala Met Leu Gln Phe Thr Tyr Asp His Ile Gln Glu His
450 455 460

Val Asn Glu Met Leu Ala Arg Ile Ser Ser Ser Trp Cys Gln Leu Gln
465 470 475 480

Asn Arg Glu Arg Ala Leu Trp Ser Gly Leu Phe Pro Ile Asn Pro Ser
485 490 495

Ala Leu Ala Ser Thr Ile Leu Asp Gln Arg Val Lys Ala Arg Ile Leu
500 505 510

Gly Asp Val Ile Ser Val Ser Asn Cys Pro Glu Leu Gly Ser Asp Thr
515 520 525

Arg Ile Ile Leu Gln Asn Ser Met Arg Val Ser Gly Ser Thr Thr Arg
530 535 540

Cys Tyr Ser Arg Pro Leu Ile Ser Ile Val Ser Leu Asn Gly Ser Gly
545 550 555 560

Thr Val Glu Gly Gln Leu Gly Thr Asp Asn Glu Leu Ile Met Ser Arg
565 570 575 

Asp Leu Leu Glu Pro Cys Val Ala Asn His Lys Arg Tyr Phe Leu Phe 
580 585 590 

Gly His His Tyr Val Tyr Tyr Glu Asp Tyr Arg Tyr Val Arg Glu Ile
595 600 605

Ala Val His Asp Val Gly Met Ile Ser Thr Tyr Val Asp Leu Asn Leu
610 615 620

Thr Leu Leu Lys Asp Arg Glu Phe Met Pro Leu Gln Val Tyr Thr Arg
625 630 635 640

Asp Glu Leu Arg Asp Thr Gly Leu Leu Asp Tyr Ser Glu Ile Gln Arg
645 650 655

Arg Asn Gln Met His Ser Leu Arg Phe Tyr Asp Ile Asp Lys Val Val
660 665 670

Gln Tyr Asp Ser Gly Thr Ala Ile Met Gln Gly Met Ala Gln Phe Phe
675 680 685

Gln Gly Leu Gly Thr Ala Gly Gln Ala Val Gly His Val Val Leu Gly
690 695 700 

Ala Thr Gly Ala Leu Leu Ser Thr Val His Gly Phe Thr Thr Phe Leu 
705 710 715 720 

Ser Asn Pro Phe Gly Ala Leu Ala Val Gly Leu Leu Val Leu Ala Gly 
725 730 735 

Leu Val Ala Ala Phe Phe Ala Tyr Arg Tyr Val Leu Lys Leu Lys Thr 
740 745 750 

Ser Pro Met Lys Ala Leu Tyr Pro Leu Thr Thr Lys Gly Leu Lys Gln
755 760 765 

Leu Pro Glu Gly Met Asp Pro Phe Ala Glu Lys Pro Asn Ala Thr Asp 
770 775 780

Thr Pro Ile Glu Glu Ile Gly Asp Ser Gln Asn Thr Glu Pro Ser Val
785 790 795 800

Asn Ser Gly Phe Asp Pro Asp Lys Phe Arg Glu Ala Gln Glu Met Ile
805 810 815

Lys Tyr Met Thr Leu Val Ser Ala Ala Glu Arg Gln Glu Ser Lys Ala
820 825 830

Arg Lys Lys Asn Lys Thr Ser Ala Leu Leu Thr Ser Arg Leu Thr Gly 
835 840 845 

Leu Ala Leu Arg Asn Arg Arg Gly Tyr Ser Arg Val Arg Thr Glu Asn 
850 855 860 

Val Thr Gly Val 

865 



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 903 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Met Arg Gln Gly Ala Ala Arg Gly Cys Arg Trp Phe Val Val Trp Ala
1 5 10 15

Leu Leu Gly Leu Thr Leu Gly Val Leu Val Ala Ser Ala Ala Pro Ser
20 25 30

Ser Pro Gly Thr Pro Gly Val Ala Ala Ala Thr Gln Ala Ala Asn Gly
35 40 45 

Gly Pro Ala Thr Pro Ala Pro Pro Ala Pro Gly Pro Ala Pro Thr Gly 
50 55 60 

Asp Thr Lys Pro Lys Lys Asn Lys Lys Pro Lys Asn Pro Pro Pro Pro 
65 70 75 80 



Arg Pro Ala Gly Asp Asn Ala Thr Val Ala Ala Gly His Ala Thr Leu
85 90 95

Arg Glu His Leu Arg Asp Ile Lys Ala Glu Asn Thr Asp Ala Asn Phe
100 105 110

Tyr Val Cys Pro Pro Pro Thr Gly Ala Thr Val Val Gln Phe Glu Gln
115 120 125

Pro Arg Arg Cys Pro Thr Arg Pro Glu Gly Gln Asn Tyr Thr Glu Gly
130 135 140

Ile Ala Val Val Phe Lys Glu Asn Ile Ala Pro Tyr Lys Phe Lys Ala
145 150 155 160

Thr Met Tyr Tyr Lys Asp Val Thr Val Ser Gln Val Trp Phe Gly His
165 170 175

Arg Tyr Ser Gln Phe Met Gly Ile Phe Glu Asp Arg Ala Pro Val Pro
180 185 190

Phe Glu Glu Val Ile Asp Lys Ile Asn Ala Lys Gly Val Cys Arg Ser
195 200 205 

Thr Ala Lys Tyr Val Arg Asn Asn Leu Glu Thr Thr Ala Phe His Arg 
210 215 220 

Asp Asp His Glu Thr Asp Met Glu Leu Lys Pro Ala Asn Ala Ala Thr 
225 230 235 240 

Arg Thr Ser Arg Gly Trp His Thr Thr Asp Leu Lys Tyr Asn Pro Ser 
245 250 255 

Arg Val Glu Ala Phe His Arg Tyr Gly Thr Thr Val Asn Cys Ile Val
260 265 270 

Glu Glu Val Asp Ala Arg Ser Val Tyr Pro Tyr Asp Glu Phe Val Leu 
275 280 285 

Ala Thr Gly Asp Phe Val Tyr Met Ser Pro Phe Tyr Gly Tyr Arg Glu 
290 295 300 

Gly Ser His Thr Glu His Thr Ser Tyr Ala Ala Asp Arg Phe Lys Gln
305 310 315 320 

Val Asp Gly Phe Tyr Ala Arg Asp Leu Thr Thr Lys Ala Arg Ala Thr 
325 330 335 

Ala Pro Thr Thr Arg Asn Leu Leu Thr Thr Pro Lys Phe Thr Val Ala 
340 345 350 

Trp Asp Trp Val Pro Lys Arg Pro Ser Val Cys Thr Met Thr Lys Trp 
355 360 365 

Gln Glu Val Asp Glu Met Leu Arg Ser Glu Tyr Gly Gly Ser Phe Arg
370 375 380

Phe Ser Ser Asp Ala Ile Ser Thr Thr Phe Thr Thr Asn Leu Thr Glu
385 390 395 400

Tyr Pro Leu Ser Arg Val Asp Leu Gly Asp Cys Ile Gly Lys Asp Ala
405 410 415

Arg Asp Ala Met Asp Arg Ile Phe Ala Arg Arg Tyr Asn Ala Thr His
420 425 430

Ile Lys Val Gly Gln Pro Gln Tyr Tyr Leu Ala Asn Gly Gly Phe Leu
435 440 445

Ile Ala Tyr Gln Pro Leu Leu Ser Asn Thr Leu Ala Glu Leu Tyr Val
450 455 460

Arg Glu His Leu Arg Glu Gln Ser Arg Lys Pro Pro Asn Pro Thr Pro
465 470 475 480

Pro Pro Pro Gly Ala Ser Ala Asn Ala Ser Val Glu Arg Ile Lys Thr
485 490 495 

Thr Ser Ser Ile Glu Phe Ala Arg Leu Gln Phe Thr Tyr Asn His Ile
500 505 510

Gln Arg His Val Asn Asp Met Leu Gly Arg Val Ala Ile Ala Trp Cys
515 520 525

Glu Leu Gln Asn His Glu Leu Thr Leu Trp Asn Glu Ala Arg Lys Leu
530 535 540

Asn Pro Asn Ala Ile Ala Ser Ala Thr Val Gly Arg Arg Val Ser Ala
545 550 555 560 

Arg Met Leu Gly Asp Val Met Ala Val Ser Thr Cys Val Pro Val Ala 
565 570 575 

Ala Asp Asn Val Ile Val Gln Asn Ser Met Arg Ile Ser Ser Arg Pro
580 585 590

Gly Ala Cys Tyr Ser Arg Pro Leu Val Ser Phe Arg Tyr Glu Asp Gln
595 600 605

Gly Pro Leu Val Glu Gly Gln Val Gly Glu Asn Asn Glu Leu Arg Leu
610 615 620

Thr Arg Asp Ala Ile Glu Pro Cys Thr Val Gly His Arg Arg Tyr Phe
625 630 635 640 

Thr Phe Gly Gly Gly Tyr Val Tyr Phe Glu Glu Tyr Ala Tyr Ser His 
645 650 655 

Gln Leu Ser Arg Ala Asp Ile Thr Thr Val Ser Thr Phe Ile Asp Leu
660 665 670

Asn Ile Thr Met Leu Glu Asp His Glu Phe Val Pro Leu Glu Val Tyr
675 680 685

Thr Arg His Glu Ile Lys Asp Ser Gly Leu Leu Asp Tyr Thr Glu Val
690 695 700

Gln Arg Arg Asn Gln Leu His Asp Leu Arg Phe Ala Asp Ile Asp Thr
705 710 715 720

Val Ile His Ala Asp Ala Asn Ala Ala Met Phe Ala Gly Leu Gly Ala
725 730 735 

Phe Phe Glu Gly Met Gly Asp Leu Gly Arg Ala Val Gly Lys Val Val 
740 745 750 

Met Gly Ile Val Gly Gly Val Val Ser Ala Val Ser Gly Val Ser Ser
755 760 765 

Phe Met Ser Asn Pro Phe Gly Ala Leu Ala Val Gly Leu Leu Val Leu 
770 775 780

Ala Gly Leu Ala Ala Ala Phe Phe Ala Phe Arg Tyr Val Met Arg Leu 
785 790 795 800 

Gln Ser Asn Pro Met Lys Ala Leu Tyr Pro Leu Thr Thr Lys Glu Leu
805 810 815 

Lys Asn Pro Thr Asn Pro Asp Ala Ser Gly Glu Gly Glu Glu Gly Gly 
820 825 830 

Asp Phe Asp Glu Ala Lys Leu Ala Glu Ala Arg Glu Met Ile Arg Tyr
835 840 845 

Met Ala Leu Val Ser Ala Met Glu Arg Thr Glu His Lys Ala Lys Lys 
850 855 860 

Lys Gly Thr Ser Ala Leu Leu Ser Ala Lys Val Thr Asp Met Val Met 
865 870 875 880 

Arg Lys Arg Arg Asn Thr Asn Tyr Thr Gln Val Pro Asn Lys Asp Gly
885 890 895

Asp Ala Asp Glu Asp Asp Leu 
900 



(2) INFORMATION FOR SEQ ID NO: 23: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 885 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Met Arg Pro Arg Gly Thr Pro Pro Ser Phe Leu Pro Leu Pro Val Leu 
1 5 10 15

Leu Ala Leu Ala Val Ile Ala Ala Ala Gly Arg Ala Ala Pro Ala Ala
20 25 30 

Ala Ala Ala Pro Thr Ala Asp Pro Ala Ala Thr Pro Ala Leu Pro Glu 
35 40 45 

Asp Glu Glu Val Pro Asp Glu Asp Gly Glu Gly Val Ala Thr Pro Ala 
50 55 60 

Pro Ala Ala Asn Ala Ser Val Glu Ala Gly Arg Ala Thr Leu Arg Glu 
65 70 75 80 

Asp Leu Arg Glu Ile Lys Ala Arg Asp Gly Asp Ala Thr Phe Tyr Val
85 90 95

Cys Pro Pro Pro Thr Gly Ala Thr Val Val Gln Phe Glu Gln Pro Arg
100 105 110

Pro Cys Pro Arg Ala Pro Asp Gly Gln Asn Tyr Thr Glu Gly Ile Ala
115 120 125

Val Val Phe Lys Glu Asn Ile Ala Pro Tyr Lys Phe Lys Ala Thr Met
130 135 140

Tyr Tyr Lys Asp Val Thr Val Ser Gln Val Trp Phe Gly His Arg Tyr
145 150 155 160

Ser Gln Phe Met Gly Ile Phe Glu Asp Arg Ala Pro Val Pro Phe Glu
165 170 175

Glu Val Met Asp Lys Ile Asn Ala Lys Gly Val Cys Arg Ser Thr Ala
180 185 190 

Lys Tyr Val Arg Asn Asn Met Glu Ser Thr Ala Phe His Arg Asp Asp 
195 200 205 

His Glu Ser Asp Met Ala Leu Lys Pro Ala Lys Ala Ala Thr Arg Thr 
210 215 220 

Ser Arg Gly Trp His Thr Thr Asp Leu Lys Tyr Asn Pro Ala Arg Val 
225 230 235 240 

Glu Ala Phe His Arg Tyr Gly Thr Thr Val Asn Cys Ile Val Glu Glu
245 250 255 

Val Glu Ala Arg Ser Val Tyr Pro Tyr Asp Glu Phe Val Leu Ala Thr 
260 265 270 

Gly Asp Phe Val Tyr Met Ser Pro Phe Tyr Gly Tyr Arg Asp Gly Ser 
275 280 285 

His Gly Glu His Thr Ala Tyr Ala Ala Asp Arg Phe Arg Gln Val Asp
290 295 300 

Gly Tyr Tyr Glu Arg Asp Leu Ser Thr Gly Arg Arg Ala Ala Ala Pro 
305 310 315 320 

Val Thr Arg Asn Leu Leu Thr Thr Pro Lys Phe Thr Val Gly Trp Asp 
325 330 335 

Trp Ala Pro Lys Arg Pro Ser Val Cys Thr Leu Thr Lys Trp Arg Glu 
340 345 350 

Val Asp Glu Met Leu Arg Ala Glu Tyr Gly Pro Ser Phe Arg Phe Ser 
355 360 365 



Ser Ala Ala Leu Ser Thr Thr Phe Thr Ala Asn Arg Thr Glu Tyr Ala 
370 375 380

Leu Ser Arg Val Asp Leu Ala Asp Cys Val Gly Arg Glu Ala Arg Glu
385 390 395 400

Ala Val Asp Arg Ile Phe Leu Arg Arg Tyr Asn Gly Thr His Val Lys
405 410 415

Val Gly Gln Val Gln Tyr Tyr Leu Ala Thr Gly Gly Phe Leu Ile Ala
420 425 430

Tyr Gln Pro Leu Leu Ser Asn Ala Leu Val Glu Leu Tyr Val Arg Glu
435 440 445

Leu Val Arg Glu Gln Thr Arg Arg Pro Ala Gly Gly Asp Pro Gly Glu
450 455 460

Ala Ala Thr Pro Gly Pro Ser Val Asp Pro Pro Ser Val Glu Arg Ile
465 470 475 480

Lys Thr Thr Ser Ser Val Glu Phe Ala Arg Leu Gln Phe Thr Tyr Asp
485 490 495

His Ile Gln Arg His Val Asn Asp Met Leu Gly Arg Ile Ala Thr Ala
500 505 510

Trp Cys Glu Leu Gln Asn Arg Glu Leu Thr Leu Trp Asn Glu Ala Arg
515 520 525

Arg Leu Asn Pro Gly Ala Ile Ala Ser Ala Thr Val Gly Arg Arg Val
530 535 540 

Ser Ala Arg Met Leu Gly Asp Val Met Ala Val Ser Thr Cys Val Pro 
545 550 555 560 

Val Ala Pro Asp Asn Val Ile Met Gln Asn Ser Ile Gly Val Ala Ala
565 570 575 

Arg Pro Gly Thr Cys Tyr Ser Arg Pro Leu Val Ser Phe Arg Tyr Glu 
580 585 590 

Ala Asp Gly Pro Leu Val Glu Gly Gln Leu Gly Glu Asp Asn Glu Ile
595 600 605 

Arg Leu Glu Arg Asp Ala Leu Glu Pro Cys Thr Val Gly His Arg Arg 
610 615 620 

Tyr Phe Thr Phe Gly Ala Gly Tyr Val Tyr Phe Glu Glu Tyr Ala Tyr 
625 630 635 640 

Ser His Gln Leu Gly Arg Ala Asp Val Thr Thr Val Ser Thr Phe Ile
645 650 655 

Asn Leu Asn Leu Thr Met Leu Glu Asp His Glu Phe Val Pro Leu Glu 
660 665 670 

Val Tyr Thr Arg Gln Glu Ile Lys Asp Ser Gly Leu Leu Asp Tyr Thr
675 680 685

Glu Val Gln Arg Arg Asn Gln Leu His Ala Leu Arg Phe Ala Asp Ile
690 695 700

Asp Thr Val Ile Lys Ala Asp Ala His Ala Ala Leu Phe Ala Gly Leu
705 710 715 720 

Tyr Ser Phe Phe Glu Gly Leu Gly Asp Val Gly Arg Ala Val Gly Lys 
725 730 735 

Val Val Met Gly Ile Val Gly Gly Val Val Ser Ala Val Ser Gly Val
740 745 750 

Ser Ser Phe Leu Ser Asn Pro Phe Gly Ala Leu Ala Val Gly Leu Leu 
755 760 765 

Val Leu Ala Gly Leu Ala Ala Ala Phe Phe Ala Phe Arg Tyr Val Met 
770 775 780

Arg Leu Gln Arg Asn Pro Met Lys Ala Leu Tyr Pro Leu Thr Thr Lys
785 790 795 800

Glu Leu Lys Ser Asp Gly Ala Pro Leu Ala Gly Gly Gly Glu Asp Gly 
805 810 815

Ala Glu Asp Phe Asp Glu Ala Lys Leu Ala Gln Ala Arg Glu Met Ile
820 825 830 

Arg Tyr Met Ala Leu Val Ser Ala Met Glu Arg Thr Glu His Lys Ala 
835 840 845 

Arg Lys Lys Gly Thr Ser Ala Leu Leu Ser Ala Lys Val Thr Asp Ala 
850 855 860 

Val Met Arg Lys Arg Ala Arg Pro Arg Tyr Ser Pro Leu Arg Asp Thr 
865 870 875 880 

Asp Glu Glu Glu Leu 
885 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
GCTGTTCAGA TTTGACTTAG AYMANMCNTG YCC 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
GTGTACAAGA AGAACATCGT GCCNTAYATN TTYAA 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
GTGTACAAGA AGAACATCGT GCC 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
AACATGTCTA CAATCTCACA RTTNACNGTN GT 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
AACATGTCTA CAATCTCACA 20



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
AATAACCTCT TTACGGCCCA AATTCARTWY GCNTAYGA 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
CCAACGAGTG TGATGTCAGC CATTTAYGGN AARCCNGT 



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
CCAACGAGTG TGATGTCAGC C 



(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
TGCTACTCGC GACCTCTAGT CACCTTYAAR TTYRTNAA 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
TGCTACTCGC GACCTCTAGT CACC 



(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
ACCGGAGTAC AGTTCCACTG TYTTRAARTC DATRTT 36



(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
TGTCACCTTG ACATGAGGCC A 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
TTTGACCTGG AGACTATGTT YMGNGARTAY AA 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
GCTCTGGGTG TAGTAGTTRT AYTCYCTRAA CAT 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
TCTCGGAACA TGCTCTCCAG RTCRAAMACR TT 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
ACCTTCATCA AAAATCCCTT NGGNGGNATG YT 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
TGGACTTACA GGACTCGAAC NACNGTNAAY TG



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
AGACCCGTGC CACTCTATGA RATHAGYCAY ATGGA 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
AGACCCGTGC CACTCTATGA 



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
GTTCACAACA ATCTTCATNG ARCTRAARCA 



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
GTTCACAACA ATCTTCAT 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
GTCAACGGAG TAGARAAYAC NTTYACNGA



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
ACTGGCTGGC TAAAGTACCT TTGAATRTTR TCNGT



(2) INFORMATION FOR SEQ ID NO: 47: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
ACTGGCTGGC TAAAGTACCT TTG

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
TGCTGCTTCT GTCATACCGC G 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
TATTTGTTTG TGATTGCTGC T 



(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
GCGGTATGAC AGAAGCAGCA A 



(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
AACAAATATG AGATCCCCAG G



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 
TCATCCCGAT CGGTGAACGT A



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
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(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
TTGTCAGTTA GACCTTCGAC G 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
CCCGTCGAAG GTCTAACTGA C 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
AGCCAACCAG TACTGTACTC T 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 
TGATGGCGGA CTCTGTCAAG C



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 
GTTCATACTT GTTGGTGATG G 



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
GGGCTTGACA GAGTCCGCCA T 



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
ACAAGTATGA ACTCCCGAGA C 21



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 
ACCCCGTTGA CATTTACCTT C 21



(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
TCGTCTCTGT CAGTAAATGT G 21



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 
CCACAGTATT CCTCCAACCA G 21



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 
GGTACTTTAG CCAGCCGGTC A 21



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

Tyr Arg Lys Ile Ala Thr Ser Val Thr Val Tyr Arg Gly
1 5 10



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

Ile Tyr Ala Glu Pro Gly Trp Phe Pro Gly Ile Tyr Arg Val Arg
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Arg Tyr Phe Ser Gln Pro
1 5 



(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Val Thr Val Tyr Arg Gly 
1 5 



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

Ala Ile Thr Asn Lys Tyr Glu
1 5 



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Ser His Met Asp Ser Thr Tyr 
1 5 



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

Val Glu Asn Thr Phe Thr Asp 
1 5 



(2) INFORMATION FOR SEQ ID NO: 71:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

Thr Val Phe Leu Gln Pro Val
1 5 



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Thr Asp Asn Ile Gln Arg Tyr
1 5 



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

Arg Gly Met Thr Glu Ala Ala 
1 5 



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Pro Val Leu Tyr Ser Glu Pro 
1 5 



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Arg Gly Leu Thr Glu Ser Ala 
1 5 



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Pro Val Ile Tyr Ala Glu Pro
1 5 



(2) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 
GCCTTTGAGA ATTCYAARTA YATHAAR 

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 
GGGTTTGAGA ATTCYAARTA YATHAAR 
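
SEQ ID NOS: 77 and 78 contain the ambiguity letters Y, R and H. The following is a minimal sketch, assuming these letters follow the standard IUPAC nucleotide code (Y = C/T, R = A/G, H = A/C/T), of how such a degenerate oligonucleotide can be expanded into the explicit sequences it represents. The helper name `expand` is hypothetical and is not part of the listing.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (assumed to apply to this listing).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "H": "ACT", "N": "ACGT",
}


def expand(primer: str) -> list[str]:
    """Return every unambiguous sequence represented by a degenerate primer."""
    # Spaces only separate the blocks of ten in the listing, so drop them first.
    choices = [IUPAC[base] for base in primer.replace(" ", "")]
    return ["".join(combo) for combo in product(*choices)]


seq77 = "GCCTTTGAGA ATTCYAARTA YATHAAR"  # SEQ ID NO: 77 as printed above
variants = expand(seq77)
print(len(variants))  # Y, R, Y, H, R give 2 * 2 * 2 * 3 * 2 = 48 sequences
```

Under this assumption the degeneracy is confined to the last 13 positions of the oligonucleotide; the first 14 positions are identical in every expanded sequence.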



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

Thr Ala Ala Ala Ala Gly Thr Ala Cys Ala Gly Cys Thr Cys Cys Thr 
1 5 10 15

Gly Cys Cys Cys Gly Ala Ala Asn Ala Cys Arg Thr Thr Asn Ala Cys 
20 25 30 



Arg Cys Ala 
35 



(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 
TGTGGAAACG GGAGCGTACA C 



(2) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:
TCAGACAAGA GTACGTGTCG G



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:
TACAGGTCGA CCGTAGATGG C



(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 
CGCCATTTCC GTGACCGAGT G 

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 
TGATGAAGTA GTGTTCGCAG G

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 
GATGCCACCC AGGTCCGCCA C 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 
GTGGCGGACC TGGGTGGCAT C 
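
SEQ ID NOS: 85 and 86, as printed above, appear to be reverse complements of one another, which would be consistent with a pair of oligonucleotides directed at the two strands of the same 21-base site. The following is a minimal sketch of that check; the function name `reverse_complement` is hypothetical, and the interpretation of the pair is an observation about the printed sequences rather than a statement from the listing.

```python
def reverse_complement(seq: str) -> str:
    """Reverse-complement an unambiguous DNA sequence, ignoring spaces."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.replace(" ", "").translate(complement)[::-1]


seq85 = "GATGCCACCC AGGTCCGCCA C"  # SEQ ID NO: 85 as printed above
seq86 = "GTGGCGGACC TGGGTGGCAT C"  # SEQ ID NO: 86 as printed above

print(reverse_complement(seq85) == seq86.replace(" ", ""))  # True
```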

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 
CGTAGATCGC AGGGCACCTC C 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:
GTCTCTCCCG CGAATACTTC T 21

(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 
GAGGGCCTGC TGGAGGACGT G 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 
CGGTGGAGAA GCCGCAGGAT G 



(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3612 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2..406

(D) OTHER INFORMATION: /function= "Capsid/Maturation/Transport gene"

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 393..2927

(D) OTHER INFORMATION: /function= "Glycoprotein B gene"

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3057..3611

(D) OTHER INFORMATION: /product= "DNA Polymerase"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

TGGGGGCATG TTTCCCATTC AAAAGATGAT GGTATCAGAG ATGATCTGGC CCAGCATAGA 60

GCGGAAGGAC TGGATAGAGC CCAACTTCAA CCAGTTCTAT AGCTTTGAGA ATCAAGACAT 120

AAACCATCTG CAAAAGAGAG CTTGGGAATA TATCAGAGAG CTGGTATTAT CGGTTTCTCT 180

GAACAACAGA ACTTGGGAGA GGGAGCTAAA AATACTTCTC ACGCCTCAGG GCTCACCGGG 240

GTTTGAGGAA CCGAAACCCG CAGGACTCAC AACGGGGCTG TACCTAACAT TTGAGATATC 300

TGCGCCCTTG GTGTTGGTGG ATAAAAAATA TGGCTGGATA TTTAAAGACC TGTACGCCCT 360

TCTGTACCAC CACCTGCAAC TGAGCAACCA CAATGACTCC CAGGTCTAGA TTGGCCACCC 420

TGGGGACTGT CATCCTGTTG GTCTGCTTTT GCGCAGGCGC GGCGCACTCG AGGGGTGACA 480

CCTTTCAGAC GTCCAGTTCC CCCACACCCC CAGGATCTTC CTCTAAGGCC CCCACCAAAC 540

CTGGTGAGGA AGCATCTGGT CCTAAGAGTG TGGACTTTTA CCAGTTCAGA GTGTGTAGTG 600

CATCGATCAC CGGGGAGCTT TTTCGGTTCA ACCTGGAGCA GACGTGCCCA GACACCAAAG 660

ACAAGTACCA CCAAGAAGGA ATTTTACTGG TGTACAAAAA AAACATAGTG CCTCATATCT 720

TTAAGGTGCG GCGCTATAGG AAAATTGCCA CCTCTGTCAC GGTCTACAGG GGCTTGACAG 780

AGTCCGCCAT CACCAACAAG TATGAACTCC CGAGACCCGT GCCACTCTAT GAGATAAGCC 840

ACATGGACAG CACCTATCAG TGCTTTAGTT CCATGAAGGT AAATGTCAAC GGGGTAGAAA 900

ACACATTTAC TGACAGAGAC GATGTTAACA CCACAGTATT CCTCCAACCA GTAGAGGGGC 960

TTACGGATAA CATTCAAAGG TACTTTAGCC AGCCGGTCAT CTACGCGGAA CCCGGCTGGT 1020

TTCCCGGCAT ATACAGAGTT AGGACCACYG TCAATTGCGA GATAGTGGAC ATGATAGCCA 1080

GGTCTGCTGA ACCATACAAT TACTTTGTCA CGTCACTGGG TGACACGGTG GAAGTCTCCC 1140

CTTTTTGCTA TAACGAATCC TCATGCAGCA CAACCCCCAG CAACAAAAAT GGCCTTAGCG 1200

TCCAAGTAGT TCTCAACCAC ACTGTGGTCA CGTACTCTGA CAGAGGAACC AGTCCCACTC 1260

CCCAAAACAG GATCTTTGTG GAAACGGGAG CGTACACGCT TTCGTGGGCC TCCGAGAGCA 1320

AGACCACGGC CGTGTGTCCG CTGGCACTGT GGAAAACCTT CCCGCGCTCC ATCCAGACTA 1380

CCCACGAGGA CAGCTTCCAC TTTGTGGCCA ACGAGATCAC GGCCACCTTC ACGGCTCCTC 1440

TAACGCCAGT GGCCAACTTT ACCGACACGT ACTCTTGTCT GACCTCGGAT ATCAACACCA 1500

CGCTTAACGC CAGCAAGGCC AAACTGGCGA GCACTCACGT CCCTAACGGG ACGGTCCAGT 1560

ACTTCCACAC AACAGGCGGA CTCTATTTGG TCTGGCAGCC CATGTCCGCG ATTAACCTGA 1620

CTCACGCTCA GGGCGACAGC GGGAACCCCA CGTCATCGCC GCCCCCCTCC GCATCCCCCA 1680

TGACCACCTC TGCCAGCCGC AGAAAGAGAC GGTCAGCCAG TACCGCTGCT GCCGGCGGCG 1740

GGGGGTCCAC GGACAACCTG TCTTACACGC AGCTGCAGTT TGCCTACGAC AAACTGCGGG 1800

ATGGCATTAA TCAGGTGTTA GAAGAACTCT CCAGGGCATG GTGTCGCGAG CAGGTCAGGG 1860

ACAACCTAAT GTGGTACGAG CTCAGTAAAA TCAACCCCAC CAGCGTTATG ACAGCCATCT 1920

ACGGTCGACC TGTATCCGCC AAGTTCGTAG GAGACGCCAT TTCCGTGACC GAGTGCATTA 1980

ACGTGGACCA GAGCTCCGTA AACATCCACA AGAGCCTCAG AACCAATAGT AAGGACGTGT 2040

GTTACGCGCG CCCCCTGGTG ACGTTTAAGT TTTTGAACAG TTCCAACCTA TTCACCGGCC 2100

AGCTGGGCGC GCGCAATGAG ATAATACTGA CCAACAACCA GGTGGAAACC TGCAAAGACA 2160

CCTGCGAACA CTACTTCATC ACCCGCAACG AGACTCTGGT GTATAAGGAC TACGCGTACC 2220

TGCGCACTAT AAACACCACT GACATATCCA CCCTGAACAC TTTTATCGCC CTGAATCTAT 2280

CCTTTATTCA AAACATAGAC TTCAAGGCCA TCGAGCTGTA CAGCAGTGCA GAGAAACGAC 2340

TCGCGAGTAG CGTGTTTGAC CTGGAGACGA TGTTCAGGGA GTACAACTAC TACACACATC 2400

GTCTCGCGGG TTTGCGCGAG GATCTGGACA ACACCATAGA TATGAACAAG GAGCGCTTCG 2460

TAAGGGACTT GTCGGAGATA GTGGCGGACC TGGGTGGCAT CGGAAAAACG GTKGTGAACG 2520

TGGCCAGCAG CGTGGTCACT CTATGTGGCT CATTGGTTAC CGGATTCATA AATTTTATTA 2580

AACACCCCCT AGGTGGCATG CTGATGATCA TTATCGTTAT AGCAATCATC CTGATCATTT 2640

TTATGCTCAG TCGCCGCACC AATACCATAG CCCAGGCGCC GGTGAAGATG ATCTACCCCG 2700

ACGTAGATCG CAGGGCACCT CCTAGCGGCG GAGCCCCAAC ACGGGAGGAA ATCAAAAACA 2760

TCCTGCTGGG AATGCACCAG CTACAACAAG AGGAGAGGCA GAAGGCGGAT GATYTGAAAA 2820

AAAGTACACC CTCGGTGTTT CAGCGTACCG CAAACGGCCT TCGTCAGCGT CTGAGAGGAT 2880

ATAAACCTCT GACTCAATCG CTAGACATCA GTCYGGAAAC GGGGGAGTGA CAGTGGATTC 2940

GAGGTTATTG TTTGATGTAA ATTTAGGAAA CACGGCCCGC CTCTGAAGCA CCACATACAG 3000

ACTGCAGTTA TCAACCCTAC TCGTTGCACA CAGACACAAA TTACCGTCCG CAGATCATGG 3060

ATTTTTTCAA TCCATTTATC GACCCAACTC GCGGAGGCCC GAGAAACACT GTGAGGCAAC 3120

CCACGCCGTC ACAGTCGCCA ACTGTCCCCT CGGAGACAAG AGTATGCAGG CTTATACCGG 3180

CCTGTTTCCA AACCCCGGGG CGACCCGGCG TGGTTGCCGT GGACACCACA TTTCCACCCA 3240

CCTACTTCCA GGGCCCCAAG CGGGGAGAAG TATTCGCGGG AGAGACTGGG TCTATCTGGA 3300

AAACAAGGCG CGGACAGGCA CGCAATGCTC CTATGTCGCA CCTCATATTC CACGTATACG 3360

ACATCGTGGA GACCACCTAC ACGGCCGACC GCTGCGAGGA CGTGCCATTT AGCTTCCAGA 3420

CTGATATCAT TCCCAGCGGC ACCGTCCTCA AGCTGCTCGG CAGAACACTA GATGGCGCCA 3480

GTGTCTGCGT GAACGTTTTC AGGCAGCGCT GCTACTTCTA CACACTAGCA CCCCAGGGGG 3540

TAAACCTGAC CCACGTCCTC CAGCAGGCCC TCCAGGCTGG CTTCGGTCGC GCATCCTGCG 3600

GCTTCTCCAC CG 3612
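
The three CDS locations given in the FEATURE entries for SEQ ID NO: 91 (2..406, 393..2927 and 3057..3611) can be cross-checked against the lengths stated for the amino acid sequences SEQ ID NOS: 93, 94 and 95 (135, 845 and 185 residues). The following is a small sketch of that arithmetic. It assumes 1-based inclusive coordinates that exclude the stop codon, and it assumes that SEQ ID NOS: 93 to 95 are the translations of the three regions in the order listed; the pairing is inferred from the matching lengths and opening residues, not stated in the feature table.

```python
# CDS coordinates (1-based, inclusive) from the FEATURE entries of SEQ ID NO: 91,
# paired with the residue counts stated for SEQ ID NOS: 93, 94 and 95.
# The pairing itself is an assumption made for this check.
regions = [
    ("Capsid/Maturation/Transport gene", 2, 406, 135),
    ("Glycoprotein B gene", 393, 2927, 845),
    ("DNA Polymerase", 3057, 3611, 185),
]

results = []
for name, start, end, stated_residues in regions:
    nucleotides = end - start + 1
    codons, remainder = divmod(nucleotides, 3)
    results.append((name, nucleotides, codons, remainder))
    print(f"{name}: {nucleotides} nt = {codons} codons "
          f"(remainder {remainder}, stated {stated_residues})")
```

Each region is a whole number of codons and matches the stated residue count. The first two regions overlap by 14 nucleotides (393 to 406), and because 393 - 2 is not a multiple of 3, they are read in different frames.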

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3056 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

TGGGGGCATG TTTCCCATTC AAAAGATGAT GGTATCAGAG ATGATCTGGC CCAGCATAGA 60

GCGGAAGGAC TGGATAGAGC CCAACTTCAA CCAGTTCTAT AGCTTTGAGA ATCAAGACAT 120

AAACCATCTG CAAAAGAGAG CTTGGGAATA TATCAGAGAG CTGGTATTAT CGGTTTCTCT 180

GAACAACAGA ACTTGGGAGA GGGAGCTAAA AATACTTCTC ACGCCTCAGG GCTCACCGGG 240

GTTTGAGGAA CCGAAACCCG CAGGACTCAC AACGGGGCTG TACCTAACAT TTGAGATATC 300

TGCGCCCTTG GTGTTGGTGG ATAAAAAATA TGGCTGGATA TTTAAAGACC TGTACGCCCT 360

TCTGTACCAC CACCTGCAAC TGAGCAACCA CAATGACTCC CAGGTCTAGA TTGGCCACCC 420

TGGGGACTGT CATCCTGTTG GTCTGCTTTT GCGCAGGCGC GGCGCACTCG AGGGGTGACA 480

CCTTTCAGAC GTCCAGTTCC CCCACACCCC CAGGATCTTC CTCTAAGGCC CCCACCAAAC 540

CTGGTGAGGA AGCATCTGGT CCTAAGAGTG TGGACTTTTA CCAGTTCAGA GTGTGTAGTG 600

CATCGATCAC CGGGGAGCTT TTTCGGTTCA ACCTGGAGCA GACGTGCCCA GACACCAAAG 660

ACAAGTACCA CCAAGAAGGA ATTTTACTGG TGTACAAAAA AAACATAGTG CCTCATATCT 720

TTAAGGTGCG GCGCTATAGG AAAATTGCCA CCTCTGTCAC GGTCTACAGG GGCTTGACAG 780

AGTCCGCCAT CACCAACAAG TATGAACTCC CGAGACCCGT GCCACTCTAT GAGATAAGCC 840

ACATGGACAG CACCTATCAG TGCTTTAGTT CCATGAAGGT AAATGTCAAC GGGGTAGAAA 900

ACACATTTAC TGACAGAGAC GATGTTAACA CCACAGTATT CCTCCAACCA GTAGAGGGGC 960

TTACGGATAA CATTCAAAGG TACTTTAGCC AGCCGGTCAT CTACGCGGAA CCCGGCTGGT 1020

TTCCCGGCAT ATACAGAGTT AGGACCACYG TCAATTGCGA GATAGTGGAC ATGATAGCCA 1080

GGTCTGCTGA ACCATACAAT TACTTTGTCA CGTCACTGGG TGACACGGTG GAAGTCTCCC 1140

CTTTTTGCTA TAACGAATCC TCATGCAGCA CAACCCCCAG CAACAAAAAT GGCCTTAGCG 1200

TCCAAGTAGT TCTCAACCAC ACTGTGGTCA CGTACTCTGA CAGAGGAACC AGTCCCACTC 1260

CCCAAAACAG GATCTTTGTG GAAACGGGAG CGTACACGCT TTCGTGGGCC TCCGAGAGCA 1320

AGACCACGGC CGTGTGTCCG CTGGCACTGT GGAAAACCTT CCCGCGCTCC ATCCAGACTA 1380

CCCACGAGGA CAGCTTCCAC TTTGTGGCCA ACGAGATCAC GGCCACCTTC ACGGCTCCTC 1440

TAACGCCAGT GGCCAACTTT ACCGACACGT ACTCTTGTCT GACCTCGGAT ATCAACACCA 1500

CGCTTAACGC CAGCAAGGCC AAACTGGCGA GCACTCACGT CCCTAACGGG ACGGTCCAGT 1560

ACTTCCACAC AACAGGCGGA CTCTATTTGG TCTGGCAGCC CATGTCCGCG ATTAACCTGA 1620

CTCACGCTCA GGGCGACAGC GGGAACCCCA CGTCATCGCC GCCCCCCTCC GCATCCCCCA 1680

TGACCACCTC TGCCAGCCGC AGAAAGAGAC GGTCAGCCAG TACCGCTGCT GCCGGCGGCG 1740

GGGGGTCCAC GGACAACCTG TCTTACACGC AGCTGCAGTT TGCCTACGAC AAACTGCGGG 1800

ATGGCATTAA TCAGGTGTTA GAAGAACTCT CCAGGGCATG GTGTCGCGAG CAGGTCAGGG 1860

ACAACCTAAT GTGGTACGAG CTCAGTAAAA TCAACCCCAC CAGCGTTATG ACAGCCATCT 1920

ACGGTCGACC TGTATCCGCC AAGTTCGTAG GAGACGCCAT TTCCGTGACC GAGTGCATTA 1980

ACGTGGACCA GAGCTCCGTA AACATCCACA AGAGCCTCAG AACCAATAGT AAGGACGTGT 2040

GTTACGCGCG CCCCCTGGTG ACGTTTAAGT TTTTGAACAG TTCCAACCTA TTCACCGGCC 2100

AGCTGGGCGC GCGCAATGAG ATAATACTGA CCAACAACCA GGTGGAAACC TGCAAAGACA 2160

CCTGCGAACA CTACTTCATC ACCCGCAACG AGACTCTGGT GTATAAGGAC TACGCGTACC 2220

TGCGCACTAT AAACACCACT GACATATCCA CCCTGAACAC TTTTATCGCC CTGAATCTAT 2280

CCTTTATTCA AAACATAGAC TTCAAGGCCA TCGAGCTGTA CAGCAGTGCA GAGAAACGAC 2340

TCGCGAGTAG CGTGTTTGAC CTGGAGACGA TGTTCAGGGA GTACAACTAC TACACACATC 2400

GTCTCGCGGG TTTGCGCGAG GATCTGGACA ACACCATAGA TATGAACAAG GAGCGCTTCG 2460

TAAGGGACTT GTCGGAGATA GTGGCGGACC TGGGTGGCAT CGGAAAAACG GTKGTGAACG 2520

TGGCCAGCAG CGTGGTCACT CTATGTGGCT CATTGGTTAC CGGATTCATA AATTTTATTA 2580

AACACCCCCT AGGTGGCATG CTGATGATCA TTATCGTTAT AGCAATCATC CTGATCATTT 2640

TTATGCTCAG TCGCCGCACC AATACCATAG CCCAGGCGCC GGTGAAGATG ATCTACCCCG 2700

ACGTAGATCG CAGGGCACCT CCTAGCGGCG GAGCCCCAAC ACGGGAGGAA ATCAAAAACA 2760

TCCTGCTGGG AATGCACCAG CTACAACAAG AGGAGAGGCA GAAGGCGGAT GATYTGAAAA 2820

AAAGTACACC CTCGGTGTTT CAGCGTACCG CAAACGGCCT TCGTCAGCGT CTGAGAGGAT 2880

ATAAACCTCT GACTCAATCG CTAGACATCA GTCYGGAAAC GGGGGAGTGA CAGTGGATTC 2940

GAGGTTATTG TTTGATGTAA ATTTAGGAAA CACGGCCCGC CTCTGAAGCA CCACATACAG 3000

ACTGCAGTTA TCAACCCTAC TCGTTGCACA CAGACACAAA TTACCGTCCG CAGATC 3056



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 135 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

Gly Gly Met Phe Pro Ile Gln Lys Met Met Val Ser Glu Met Ile Trp
1 5 10 15

Pro Ser Ile Glu Arg Lys Asp Trp Ile Glu Pro Asn Phe Asn Gln Phe
20 25 30

Tyr Ser Phe Glu Asn Gln Asp Ile Asn His Leu Gln Lys Arg Ala Trp
35 40 45

Glu Tyr Ile Arg Glu Leu Val Leu Ser Val Ser Leu Asn Asn Arg Thr
50 55 60

Trp Glu Arg Glu Leu Lys Ile Leu Leu Thr Pro Gln Gly Ser Pro Gly
65 70 75 80

Phe Glu Glu Pro Lys Pro Ala Gly Leu Thr Thr Gly Leu Tyr Leu Thr
85 90 95

Phe Glu Ile Ser Ala Pro Leu Val Leu Val Asp Lys Lys Tyr Gly Trp
100 105 110

Ile Phe Lys Asp Leu Tyr Ala Leu Leu Tyr His His Leu Gln Leu Ser
115 120 125

Asn His Asn Asp Ser Gln Val
130 135



(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 845 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 841

(D) OTHER INFORMATION: /note= "Proline or Leucine
depending on codon"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 

Met Thr Pro Arg Ser Arg Leu Ala Thr Leu Gly Thr Val Ile Leu Leu
1 5 10 15

Val Cys Phe Cys Ala Gly Ala Ala His Ser Arg Gly Asp Thr Phe Gln
20 25 30

Thr Ser Ser Ser Pro Thr Pro Pro Gly Ser Ser Ser Lys Ala Pro Thr
35 40 45

Lys Pro Gly Glu Glu Ala Ser Gly Pro Lys Ser Val Asp Phe Tyr Gln
50 55 60

Phe Arg Val Cys Ser Ala Ser Ile Thr Gly Glu Leu Phe Arg Phe Asn
65 70 75 80

Leu Glu Gln Thr Cys Pro Asp Thr Lys Asp Lys Tyr His Gln Glu Gly
85 90 95

Ile Leu Leu Val Tyr Lys Lys Asn Ile Val Pro His Ile Phe Lys Val
100 105 110

Arg Arg Tyr Arg Lys Ile Ala Thr Ser Val Thr Val Tyr Arg Gly Leu
115 120 125

Thr Glu Ser Ala Ile Thr Asn Lys Tyr Glu Leu Pro Arg Pro Val Pro
130 135 140

Leu Tyr Glu Ile Ser His Met Asp Ser Thr Tyr Gln Cys Phe Ser Ser
145 150 155 160

Met Lys Val Asn Val Asn Gly Val Glu Asn Thr Phe Thr Asp Arg Asp
165 170 175

Asp Val Asn Thr Thr Val Phe Leu Gln Pro Val Glu Gly Leu Thr Asp
180 185 190

Asn Ile Gln Arg Tyr Phe Ser Gln Pro Val Ile Tyr Ala Glu Pro Gly
195 200 205

Trp Phe Pro Gly Ile Tyr Arg Val Arg Thr Thr Val Asn Cys Glu Ile
210 215 220

Val Asp Met Ile Ala Arg Ser Ala Glu Pro Tyr Asn Tyr Phe Val Thr
225 230 235 240

Ser Leu Gly Asp Thr Val Glu Val Ser Pro Phe Cys Tyr Asn Glu Ser
245 250 255

Ser Cys Ser Thr Thr Pro Ser Asn Lys Asn Gly Leu Ser Val Gln Val
260 265 270

Val Leu Asn His Thr Val Val Thr Tyr Ser Asp Arg Gly Thr Ser Pro
275 280 285

Thr Pro Gln Asn Arg Ile Phe Val Glu Thr Gly Ala Tyr Thr Leu Ser
290 295 300

Trp Ala Ser Glu Ser Lys Thr Thr Ala Val Cys Pro Leu Ala Leu Trp
305 310 315 320

Lys Thr Phe Pro Arg Ser Ile Gln Thr Thr His Glu Asp Ser Phe His
325 330 335

Phe Val Ala Asn Glu Ile Thr Ala Thr Phe Thr Ala Pro Leu Thr Pro
340 345 350

Val Ala Asn Phe Thr Asp Thr Tyr Ser Cys Leu Thr Ser Asp Ile Asn
355 360 365

Thr Thr Leu Asn Ala Ser Lys Ala Lys Leu Ala Ser Thr His Val Pro
370 375 380

Asn Gly Thr Val Gln Tyr Phe His Thr Thr Gly Gly Leu Tyr Leu Val
385 390 395 400

Trp Gln Pro Met Ser Ala Ile Asn Leu Thr His Ala Gln Gly Asp Ser
405 410 415

Gly Asn Pro Thr Ser Ser Pro Pro Pro Ser Ala Ser Pro Met Thr Thr
420 425 430

Ser Ala Ser Arg Arg Lys Arg Arg Ser Ala Ser Thr Ala Ala Ala Gly
435 440 445

Gly Gly Gly Ser Thr Asp Asn Leu Ser Tyr Thr Gln Leu Gln Phe Ala
450 455 460

Tyr Asp Lys Leu Arg Asp Gly Ile Asn Gln Val Leu Glu Glu Leu Ser
465 470 475 480

Arg Ala Trp Cys Arg Glu Gln Val Arg Asp Asn Leu Met Trp Tyr Glu
485 490 495

Leu Ser Lys Ile Asn Pro Thr Ser Val Met Thr Ala Ile Tyr Gly Arg
500 505 510

Pro Val Ser Ala Lys Phe Val Gly Asp Ala Ile Ser Val Thr Glu Cys
515 520 525

Ile Asn Val Asp Gln Ser Ser Val Asn Ile His Lys Ser Leu Arg Thr
530 535 540

Asn Ser Lys Asp Val Cys Tyr Ala Arg Pro Leu Val Thr Phe Lys Phe
545 550 555 560

Leu Asn Ser Ser Asn Leu Phe Thr Gly Gln Leu Gly Ala Arg Asn Glu
565 570 575

Ile Ile Leu Thr Asn Asn Gln Val Glu Thr Cys Lys Asp Thr Cys Glu
580 585 590

His Tyr Phe Ile Thr Arg Asn Glu Thr Leu Val Tyr Lys Asp Tyr Ala
595 600 605

Tyr Leu Arg Thr Ile Asn Thr Thr Asp Ile Ser Thr Leu Asn Thr Phe
610 615 620

Ile Ala Leu Asn Leu Ser Phe Ile Gln Asn Ile Asp Phe Lys Ala Ile
625 630 635 640

Glu Leu Tyr Ser Ser Ala Glu Lys Arg Leu Ala Ser Ser Val Phe Asp
645 650 655

Leu Glu Thr Met Phe Arg Glu Tyr Asn Tyr Tyr Thr His Arg Leu Ala
660 665 670

Gly Leu Arg Glu Asp Leu Asp Asn Thr Ile Asp Met Asn Lys Glu Arg
675 680 685

Phe Val Arg Asp Leu Ser Glu Ile Val Ala Asp Leu Gly Gly Ile Gly
690 695 700

Lys Thr Val Val Asn Val Ala Ser Ser Val Val Thr Leu Cys Gly Ser
705 710 715 720

Leu Val Thr Gly Phe Ile Asn Phe Ile Lys His Pro Leu Gly Gly Met
725 730 735

Leu Met Ile Ile Ile Val Ile Ala Ile Ile Leu Ile Ile Phe Met Leu
740 745 750

Ser Arg Arg Thr Asn Thr Ile Ala Gln Ala Pro Val Lys Met Ile Tyr
755 760 765

Pro Asp Val Asp Arg Arg Ala Pro Pro Ser Gly Gly Ala Pro Thr Arg
770 775 780

Glu Glu Ile Lys Asn Ile Leu Leu Gly Met His Gln Leu Gln Gln Glu
785 790 795 800

Glu Arg Gln Lys Ala Asp Asp Leu Lys Lys Ser Thr Pro Ser Val Phe
805 810 815

Gln Arg Thr Ala Asn Gly Leu Arg Gln Arg Leu Arg Gly Tyr Lys Pro
820 825 830

Leu Thr Gln Ser Leu Asp Ile Ser Xaa Glu Thr Gly Glu
835 840 845



(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 185 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Met Asp Phe Phe Asn Pro Phe Ile Asp Pro Thr Arg Gly Gly Pro Arg
1 5 10 15

Asn Thr Val Arg Gln Pro Thr Pro Ser Gln Ser Pro Thr Val Pro Ser
20 25 30

Glu Thr Arg Val Cys Arg Leu Ile Pro Ala Cys Phe Gln Thr Pro Gly
35 40 45

Arg Pro Gly Val Val Ala Val Asp Thr Thr Phe Pro Pro Thr Tyr Phe
50 55 60

Gln Gly Pro Lys Arg Gly Glu Val Phe Ala Gly Glu Thr Gly Ser Ile
65 70 75 80

Trp Lys Thr Arg Arg Gly Gln Ala Arg Asn Ala Pro Met Ser His Leu
85 90 95

Ile Phe His Val Tyr Asp Ile Val Glu Thr Thr Tyr Thr Ala Asp Arg
100 105 110

Cys Glu Asp Val Pro Phe Ser Phe Gln Thr Asp Ile Ile Pro Ser Gly
115 120 125

Thr Val Leu Lys Leu Leu Gly Arg Thr Leu Asp Gly Ala Ser Val Cys
130 135 140

Val Asn Val Phe Arg Gln Arg Cys Tyr Phe Tyr Thr Leu Ala Pro Gln
145 150 155 160

Gly Val Asn Leu Thr His Val Leu Gln Gln Ala Leu Gln Ala Gly Phe
165 170 175

Gly Arg Ala Ser Cys Gly Phe Ser Thr
180 185



(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 386 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..384

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

GTG TAC AAG AAG AAC ATC GTG CCT AAC ATG TTC AAG GTA CGC AGG TAC
Val Tyr Lys Lys Asn Ile Val Pro Asn Met Phe Lys Val Arg Arg Tyr
1 5 10 15

AGA AAA GTA GCA ACG CCT GTC ACA CTC TAC CGC GGT ATG ACA GAC GCA
Arg Lys Val Ala Thr Pro Val Thr Leu Tyr Arg Gly Met Thr Asp Ala
20 25 30

GCA ATA ACT AAC AAA TAT GAA ATT CCC AGA CCC GTA CCA CTA TAC GAG
Ala Ile Thr Asn Lys Tyr Glu Ile Pro Arg Pro Val Pro Leu Tyr Glu
35 40 45

ATC AGT CAC ATG GAC AGC ACC TAC CAG TGC TTT AGT TCC ATG AAA ATT
Ile Ser His Met Asp Ser Thr Tyr Gln Cys Phe Ser Ser Met Lys Ile
50 55 60

GTA GTG AAC GGA GTC GAA AAC ACG TTC ACC GGT CGG GAT GAC GTA AAC
Val Val Asn Gly Val Glu Asn Thr Phe Thr Gly Arg Asp Asp Val Asn
65 70 75 80

AAA AGC GTA TTT CTC CAG CCA GTC GAA GGT CTA ACT GAC AAC ATA AAG
Lys Ser Val Phe Leu Gln Pro Val Glu Gly Leu Thr Asp Asn Ile Lys
85 90 95

AGA TAC TTT AGC CAG CCA GTG CTA TAT TCT GAA CCC GGA TGG TTT CCA
Arg Tyr Phe Ser Gln Pro Val Leu Tyr Ser Glu Pro Gly Trp Phe Pro
100 105 110

GGT ATC TAC AGG GTT AGG ACA ACA GTT AAT TGT GAG ATT GTA GAC ATG
Gly Ile Tyr Arg Val Arg Thr Thr Val Asn Cys Glu Ile Val Asp Met
115 120 125

TT



(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

Val Tyr Lys Lys Asn Ile Val Pro Asn Met Phe Lys Val Arg Arg Tyr
1 5 10 15

Arg Lys Val Ala Thr Pro Val Thr Leu Tyr Arg Gly Met Thr Asp Ala
20 25 30

Ala Ile Thr Asn Lys Tyr Glu Ile Pro Arg Pro Val Pro Leu Tyr Glu
35 40 45

Ile Ser His Met Asp Ser Thr Tyr Gln Cys Phe Ser Ser Met Lys Ile
50 55 60

Val Val Asn Gly Val Glu Asn Thr Phe Thr Gly Arg Asp Asp Val Asn
65 70 75 80

Lys Ser Val Phe Leu Gln Pro Val Glu Gly Leu Thr Asp Asn Ile Lys
85 90 95

Arg Tyr Phe Ser Gln Pro Val Leu Tyr Ser Glu Pro Gly Trp Phe Pro
100 105 110

Gly Ile Tyr Arg Val Arg Thr Thr Val Asn Cys Glu Ile Val Asp Met
115 120 125



(2) INFORMATION FOR SEQ ID NO: 98: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:
ATGTTCAGGG AGTACAACTA CTACAC 26



(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

Ile Tyr Ala Glu Pro Gly Trp Phe Pro Gly Ile Tyr Arg Val Arg Thr
1 5 10 15

Thr Val Asn Cys Glu 
20 



(2) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Val Leu Glu Glu Leu Ser Arg Ala Trp Cys Arg Glu Gln Val Arg Asp
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 48 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

Met Thr Pro Arg Ser Arg Leu Ala Thr Leu Gly Thr Val Ile Leu Leu
1 5 10 15

Val Cys Phe Cys Ala Gly Ala Ala His Ser Arg Gly Asp Thr Phe Gln
20 25 30

Thr Ser Ser Ser Pro Thr Pro Pro Gly Ser Ser Ser Ser Lys Ala Pro
35 40 45



(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

Met Val Pro Asn Lys His Leu Leu Leu Ile Leu Ser Phe Ser Thr Ala
1 5 10 15

Cys Gly Gln Thr Thr Pro Thr Thr Ala Val Glu Lys Asn Lys Thr Gln
20 25 30

Ala Ile Tyr Gln Glu Tyr Phe Lys Tyr Arg
35 40

(2) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

Met Tyr Tyr Lys Thr Ile Leu Phe Phe Ala Leu Ile Lys Val Cys Ser
1 5 10 15

Phe Asn Gln Thr Thr Thr His Ser Thr Thr Thr Ser Pro Ser Ile Ser
20 25 30

Ser Thr Thr Ser Ser Thr Thr Thr Ser Thr Ser Lys Pro
35 40 45



(2) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 49 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

Met Tyr Pro Thr Val Lys Ser Met Arg Val Ala His Leu Thr Asn Leu
1 5 10 15

Leu Thr Leu Leu Cys Leu Leu Cys His Thr His Leu Tyr Val Cys Gln
20 25 30

Pro Thr Thr Leu Arg Gln Pro Ser Asp Met Thr Pro Ala Gln Asp Ala
35 40 45



(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 48 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

Met Thr Arg Arg Arg Val Leu Ser Val Val Val Leu Leu Ala Ala Leu
1 5 10 15

Ala Cys Arg Leu Gly Ala Gln Thr Pro Glu Gln Pro Ala Pro Pro Ala
20 25 30

Thr Thr Val Gln Pro Thr Ala Thr Arg Gln Gln Leu Ser Val Val Val
35 40 45



(2) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

Cys Ala Gly Ala Ala His Ser Arg Gly Asp Thr Phe Gln Thr Ser Ser
1 5 10 15

Ser Pro Thr



(2) INFORMATION FOR SEQ ID NO: 107:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107:

Asn Ile Met Glu Ile Leu Arg Gly Asp Phe Ser Ser Ala Asn Asn Arg
1 5 10 15

Asp Asn



(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

Ser Ser Thr Ser Tyr Asn Arg Gly Asp Ser Thr Phe Glu Ser Lys Ser
1 5 10 15

Tyr Lys



(2) INFORMATION FOR SEQ ID NO: 109:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

Ala Leu Gly Gly Asp Val Glu Lys Arg Gly Asp Arg Glu Glu Ala His
1 5 10 15

Val Pro Phe Phe
20



(2) INFORMATION FOR SEQ ID NO: 110:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110:

Cys Gln Ala Gly Thr Phe Ala Leu Arg Gly Asp Ser Thr Phe Glu Glu
1 5 10 15

Ser Lys Ser



(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

Ile Thr Val Tyr Ala Val Thr Gly Arg Gly Asp Ser Pro Ala Ser Ser
1 5 10 15

Lys Pro Ile Ser
20

(2) INFORMATION FOR SEQ ID NO: 112:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

Cys Glu Val Val Thr Gly Ser Pro Arg Gly Asp Ser Gln Ser Ser Trp
1 5 10 15

Lys Ser Val Gly
20



(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113:

Cys Lys Pro Gln Val Thr Arg Gly Asp Val Phe Thr Met Pro Glu Asp
1 5 10 15

Glu Tyr



What is claimed is: 

1. An isolated polynucleotide comprising a sequence as
set forth as nucleotides 36 to 354 of SEQ. ID NO:1 or SEQ.
ID NO:3.

2. An isolated polynucleotide comprising a sequence
selected from the group consisting of: SEQ. ID NO:41, SEQ.
ID NO:43, SEQ. ID NO:45 and SEQ. ID NO:46.

3. An isolated polynucleotide comprising a sequence set
forth in a member of the group consisting of nucleotides 36
to 354 inclusive of SEQ. ID NO:1, nucleotides 36 to 354
inclusive of SEQ ID NO:3, nucleotides 36 to 354 inclusive
of SEQ ID NO:92 and SEQ. ID NO:96.

4. An isolated or non-naturally occurring polynucleotide
encoding a polypeptide comprising a sequence as set forth in
a member of the group consisting of amino acids 13 to 118
inclusive of SEQ ID NO:2, amino acids 13 to 118 inclusive
of SEQ ID NO:4, amino acids 13 to 118 inclusive of SEQ
ID NO:97, and SEQ ID NO:94.

5. A recombinant cloning or expression vector comprising
the polynucleotide of claim 4.

6. A host cell transformed by the polynucleotide of claim 4.

7. An oligonucleotide selected from the group consisting
of SEQ. ID NOS:24-63, SEQ. ID NOS:77-78, and SEQ. ID
NOS:80-90.

8. An isolated polynucleotide, where said polynucleotide
is capable of hybridizing under conditions of high stringency
with a second polynucleotide comprising a sequence
selected from the group consisting of SEQ. ID NOS:1, 3, 92,
and 94, and their respective complementary sequences, but
is not capable of hybridizing under conditions of high
stringency with a polynucleotide having a sequence of any
of SEQ. ID NOS:5-13.

9. The isolated polynucleotide of claim 8, the nucleotide
sequence of which is contained in the genome of a naturally
occurring virus.

10. A monoclonal or isolated polyclonal antibody specific
for a Glycoprotein B polypeptide encoded in said encoding
region of the polynucleotide of claim 1.

11. A monoclonal or isolated polyclonal antibody specific
for the polypeptide encoded by the polynucleotide of claim
4.

12. A diagnostic kit for detecting a herpes virus
polynucleotide in a biological sample, comprising a reagent in
suitable packaging, wherein the reagent comprises the
polynucleotide of claim 3.

13. A diagnostic kit for detecting a herpes virus
polypeptide present in a biological sample, comprising a reagent in
suitable packaging, wherein the reagent comprises the
antibody of claim 11.

14. A method of inhibiting attachment of a herpes virus to
a cell, comprising contacting the cell with a polypeptide
encoded by the polynucleotide of claim 4, wherein said
polypeptide comprises an arginine-glycine-aspartic acid
sequence.

15. A method of detecting infection of an individual by a
herpes virus, comprising the steps of:

a) contacting antibody from a sample obtained from the
individual with the polypeptide encoded by the
polynucleotide of claim 4 under conditions that permit the
formation of a stable antigen-antibody complex; and

b) detecting said stable complexes formed in step a), if
any.

16. A method of detecting infection of an individual by a
herpes virus, comprising the steps of:

a) contacting a polypeptide from a sample obtained from
the individual with the antibody of claim 11 under
conditions that permit the formation of a stable
antigen-antibody complex; and

b) detecting said stable complexes formed in step a), if
any.

17. A method of producing a Glycoprotein B polypeptide,
comprising expressing the polynucleotide of claim 4 in a
eukaryotic cell.



